merge: develop into fix/1362-repair-sqlite-integrity-preflight
Conflicts opened by #1285 (temp-staging rebuild) and #1312 (collection_name in recovery paths) merging after this branch was authored. mempalace/repair.py: - Kept this branch's sqlite_integrity_errors() and print_sqlite_integrity_abort() helpers; took develop's rebuild_index signature with the collection_name parameter from #1312. Normalized the helper's print indent to 2 spaces to match the rest of the file. tests/test_repair.py: - Kept both this branch's sqlite_integrity_errors tests and develop's rebuild_from_sqlite + configured-collection coverage. - Replaced 7 sites of sqlite_path.write_text("fake") with sqlite3.connect(...).close() — write_text("fake") fails PRAGMA quick_check, so the new preflight aborts before the rebuild logic the tests actually exercise. An empty real SQLite DB passes quick_check and lets the tests run as intended. - Took develop's temp-staging assertion shape (delete/create the __repair_tmp collection in addition to the live drawers collection) for the existing test_rebuild_index_success test. Local: 1618 tests pass, ruff lint+format clean against 0.4.x CI pin.
This commit is contained in:
@@ -36,3 +36,4 @@ venv/
|
|||||||
|
|
||||||
# ChromaDB local data
|
# ChromaDB local data
|
||||||
*.sqlite3-journal
|
*.sqlite3-journal
|
||||||
|
.envrc
|
||||||
|
|||||||
@@ -11,6 +11,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|||||||
### Bug Fixes
|
### Bug Fixes
|
||||||
|
|
||||||
- **`mempalace_diary_read` silently dropped entries on agent-name case mismatch.** `tool_diary_write` stored the `agent` metadata verbatim after `sanitize_name`, which preserves case, while `tool_diary_read` filtered by exact match. Writing as `"Claude"` and reading as `"claude"` (or vice-versa) returned zero rows. Both endpoints now lowercase `agent_name` immediately after sanitization, so reads are case-insensitive and the default per-agent wing slug is stable across casings. **Behavior change:** entries written prior to this fix under mixed-case agent names will not match the new lowercase filter; run `mempalace repair` if you need to migrate legacy diary metadata. (#1243)
|
- **`mempalace_diary_read` silently dropped entries on agent-name case mismatch.** `tool_diary_write` stored the `agent` metadata verbatim after `sanitize_name`, which preserves case, while `tool_diary_read` filtered by exact match. Writing as `"Claude"` and reading as `"claude"` (or vice-versa) returned zero rows. Both endpoints now lowercase `agent_name` immediately after sanitization, so reads are case-insensitive and the default per-agent wing slug is stable across casings. **Behavior change:** entries written prior to this fix under mixed-case agent names will not match the new lowercase filter; run `mempalace repair` if you need to migrate legacy diary metadata. (#1243)
|
||||||
|
- **Knowledge-graph triples with `valid_to < valid_from` were silently invisible.** `KnowledgeGraph.query_entity()` filters with `valid_from <= as_of AND valid_to >= as_of`, so an inverted interval matches no `as_of` and the row is durably stored but unreachable — a P0 data-integrity foot-gun any caller that mixes up the two date params can hit. `add_triple()` now rejects inverted intervals at write time with a clear `ValueError` naming both bounds. Open intervals (one bound only) and point-in-time facts (`valid_from == valid_to`) remain accepted unchanged. (#1214)
|
||||||
|
- **`ChromaBackend.close_palace()` / `close()` did not release the SQLite file lock.** Evicted clients sat in `_clients` without `close()`, and chromadb 1.5.x retains the rust-side SQLite lock until GC. Reopening the same palace path after `shutil.rmtree` + recreate within one process failed with `SQLITE_READONLY_DBMOVED` (code 1032). New `_close_client()` helper now calls `PersistentClient.close()` (with a try/except fallback for older chromadb) on `close_palace()`, on whole-backend `close()`, and on the `_client()` invalidation path that detects a missing `chroma.sqlite3`. The mtime/inode auto-invalidation branch is intentionally left alone — callers there may still hold a live `ChromaCollection`. (#1067, #1105)
|
||||||
|
- **`EntityRegistry.save()` could leave a corrupt or empty `entity_registry.json` on crash.** `Path.write_text()` is not atomic — kernel sees `open('w')` (truncate), `write`, `close`, and any failure between truncate and full-flush (power loss, OOM, FS-full, kill -9) wipes the months-of-mining people/projects map silently (the registry's `load()` swallows `JSONDecodeError`). Save now writes to a sibling `.tmp` in the same directory, `fsync`s, `chmod 0o600`s, then `os.replace()`s into place — atomic on POSIX and Windows. The previous registry stays intact on any crash before the rename returns. (#1215)
|
||||||
|
- **`mempalace compress` crashed on large palaces.** `regenerate_closets` fetched all closet_llm drawers in a single `col.get()`, which trips `SQLITE_MAX_VARIABLE_NUMBER` on palaces above ~32k drawers. Mirrors the #851 fix in `miner.py`: drawer fetch is now paginated at `batch_size=5000`. Per-source aggregation works across batches, so the LLM regeneration call still groups chunks correctly. (#1073, #1107)
|
||||||
|
- **CLI and `fact_checker --stdin` mojibaked non-ASCII content on Windows.** Python defaults `sys.stdin`/`stdout`/`stderr` to the system ANSI codepage (cp1252/cp1251/cp950), so `mempalace search > out.txt` and piped fact_checker invocations corrupted Cyrillic / CJK drawer text at the process boundary. New `mempalace/_stdio.py` helper reconfigures all three streams to UTF-8 on `sys.platform == "win32"`, with per-stream `errors` policy: `surrogateescape` on stdin (preserves bad bytes from redirected files for the consumer's parser), `replace` on stdout/stderr (substitutes U+FFFD instead of `UnicodeEncodeError`-ing mid-print). With this, all three user-facing console_scripts (`mcp_server`, `hooks_cli`, `cli`/`fact_checker`) now reconfigure identically on Windows. (#1282)
|
||||||
|
- **MCP knowledge-graph tools forwarded malformed date strings to SQLite.** `tool_kg_query` (`as_of`), `tool_kg_add` (`valid_from`), and `tool_kg_invalidate` (`ended`) accepted any string and produced empty result sets on natural-language inputs like `"March 2026"` or `"yesterday"` — callers (especially LLM agents) could not distinguish "no fact at this time" from "your date format was unrecognized." New `sanitize_iso_date()` validator in `config.py` accepts `YYYY`, `YYYY-MM`, `YYYY-MM-DD` (and passes through `None`/`""`); all three tools call it before values reach the storage layer. **Behavior change:** previously-silent date typos now raise a clear `ValueError` naming the offending field; full ISO-8601 with time (`YYYY-MM-DDTHH:MM:SS`, timezone offsets) is not yet accepted — file an issue if you have a use case. (#1164, #1167)
|
||||||
|
- **MCP server's `_kg` was a module-level singleton.** Multi-tenant hosts that rotate `MEMPALACE_PALACE_PATH` between tool calls hit the wrong sqlite file, because the KG was constructed once at import time while the ChromaDB side was already per-call via `_get_client()`. The KG is now resolved per-call through a lazy per-path cache (`_kg_by_path` keyed by `os.path.abspath`, with a double-checked-locking init under `_kg_cache_lock`). `tool_reconnect` drains and `close()`s cached KGs alongside the existing chroma reconnect. A `_call_kg` retry guard catches `sqlite3.ProgrammingError` once after a reconnect race. (#1136, #1160)
|
||||||
|
- **`mempalace repair` can now recover palaces whose HNSW segment writer is stuck on `apply_logs`.** Both the existing `--mode legacy` rebuild and the inline `cli.cmd_repair` path call `Collection.count()` as their first read — exactly the call that raises `chromadb.errors.InternalError: Failed to apply logs to the hnsw segment writer` on the corruption class introduced upstream and reported in #1308. Repair would print `Cannot recover — palace may need to be re-mined from source files` even though the underlying SQLite tables were fully intact (the corruption lives in the on-disk index files, not the data layer). New `--mode from-sqlite` reads `(id, document, metadata)` rows directly from `chroma.sqlite3` via a `segments` → `embeddings` → `embedding_metadata` join, never opens a chromadb client against the corrupt palace, and re-upserts everything into a fresh palace at `--palace`. `--source PATH` extracts from a corrupt palace already moved aside; `--archive-existing` handles the in-place case by renaming the existing palace to `<palace>.pre-rebuild-<timestamp>` before reading from it. Documents are re-embedded under the user's configured embedding function (the original HNSW vectors live in the corrupt `data_level0.bin` and cannot be recovered, but the embedding model is deterministic so search results remain semantically equivalent). Verified end-to-end on a 52,300-row real-world corrupt palace. (#1308)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -28,6 +36,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|||||||
### Bug Fixes
|
### Bug Fixes
|
||||||
|
|
||||||
- **MCP server `tool_diary_write` SIGSEGV when default EF provider differs.** `mcp_server._get_collection` bypassed `ChromaBackend.get_collection` and called `client.get_collection` / `client.create_collection` without `embedding_function=`. ChromaDB 1.x persists the EF *identity* (its `name()`) with the collection but not the EF *instance/configuration*, so the MCP server's reopen silently bound chromadb's built-in `DefaultEmbeddingFunction` — its `name()` matches `mempalace.embedding`'s spoofed `"default"` so the identity check passes, but its provider list is chromadb's default rather than the user's resolved device. The miner / Stop hook ingest path routes through the backend helper and binds the configured EF instead. On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple Silicon) the default provider selection could SIGSEGV the host process on first `col.add()`, killing the MCP stdio server and leaving every subsequent tool call returning `Connection closed` until Claude Code was relaunched. `_get_collection` now reuses `ChromaBackend._resolve_embedding_function()` on the reopen branches that actually open a collection (warm-cache reads stay zero-cost), matching the miner/backend path. (#1299, follow-up to #1262 / #1289)
|
- **MCP server `tool_diary_write` SIGSEGV when default EF provider differs.** `mcp_server._get_collection` bypassed `ChromaBackend.get_collection` and called `client.get_collection` / `client.create_collection` without `embedding_function=`. ChromaDB 1.x persists the EF *identity* (its `name()`) with the collection but not the EF *instance/configuration*, so the MCP server's reopen silently bound chromadb's built-in `DefaultEmbeddingFunction` — its `name()` matches `mempalace.embedding`'s spoofed `"default"` so the identity check passes, but its provider list is chromadb's default rather than the user's resolved device. The miner / Stop hook ingest path routes through the backend helper and binds the configured EF instead. On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple Silicon) the default provider selection could SIGSEGV the host process on first `col.add()`, killing the MCP stdio server and leaving every subsequent tool call returning `Connection closed` until Claude Code was relaunched. `_get_collection` now reuses `ChromaBackend._resolve_embedding_function()` on the reopen branches that actually open a collection (warm-cache reads stay zero-cost), matching the miner/backend path. (#1299, follow-up to #1262 / #1289)
|
||||||
|
- **Hooks no longer recreate `~/.mempalace/` after the user removes it.** When `~/.mempalace/` is deleted (a strong "do not auto-capture" signal), the next `Stop`, `PreCompact`, or `SessionStart` hook would silently rebuild the dir hierarchy and ingest existing transcripts: `_log()` called `STATE_DIR.mkdir(parents=True, exist_ok=True)` unconditionally, so the very act of writing `[HH:MM] SESSION START …` recreated `~/.mempalace/hook_state/`; subsequent calls in the save path then materialized `palace/`, `wal/`, `knowledge_graph.sqlite3`, and N drawers from `~/.claude/projects/*.jsonl`. All four entry points (`hook_stop`, `hook_precompact`, `hook_session_start`, and `_log` itself) now check a new module-level `PALACE_ROOT = Path.home() / ".mempalace"` constant first and short-circuit (returning `{}` on stdout, never logging) when the directory is absent. The user-removable directory becomes a kill-switch — `rm -rf ~/.mempalace` is now a stable state. Net: 23 lines added in `mempalace/hooks_cli.py`, 5 unit tests in `tests/test_hooks_cli.py`. (#1305)
|
||||||
- **Cross-wing topic tunnels for hyphenated dir names.** `mempalace init` recorded the `topics_by_wing` registry key under the raw directory name (e.g. `mempalace-public`), while `mempalace.yaml`'s `wing` field used the lower-cased + separator-collapsed slug (`mempalace_public`). At mine time the miner read the slug from the yaml and missed the registry, so `_compute_topic_tunnels_for_wing` returned `0` silently. Real-world: any project whose folder contained a hyphen or space lost every topic tunnel. Now both call sites route through a shared `normalize_wing_name()` in `config.py`. (#1194, follow-up to #1180)
|
- **Cross-wing topic tunnels for hyphenated dir names.** `mempalace init` recorded the `topics_by_wing` registry key under the raw directory name (e.g. `mempalace-public`), while `mempalace.yaml`'s `wing` field used the lower-cased + separator-collapsed slug (`mempalace_public`). At mine time the miner read the slug from the yaml and missed the registry, so `_compute_topic_tunnels_for_wing` returned `0` silently. Real-world: any project whose folder contained a hyphen or space lost every topic tunnel. Now both call sites route through a shared `normalize_wing_name()` in `config.py`. (#1194, follow-up to #1180)
|
||||||
- **CLI `mempalace search` retrieval quality.** The CLI was using pure ChromaDB cosine distance with no BM25 rerank, so drawers containing every query term but embedding as noise (directory listings, diff output, shell logs) scored `Match: 0.0` alongside genuinely irrelevant results with no way to tell them apart. Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool already used, and surfaced both `cosine=` and `bm25=` scores in the output so users see which component of the match is firing. MCP search was unaffected; this fixes the human-facing CLI parity gap.
|
- **CLI `mempalace search` retrieval quality.** The CLI was using pure ChromaDB cosine distance with no BM25 rerank, so drawers containing every query term but embedding as noise (directory listings, diff output, shell logs) scored `Match: 0.0` alongside genuinely irrelevant results with no way to tell them apart. Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool already used, and surfaced both `cosine=` and `bm25=` scores in the output so users see which component of the match is firing. MCP search was unaffected; this fixes the human-facing CLI parity gap.
|
||||||
- **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. Without the warning such palaces silently used L2 distance, under which the similarity display floored every result to `Match: 0.0`. New palaces mined today already set cosine correctly and now have invariant tests pinning that behavior so future refactors can't silently regress it. (#1179)
|
- **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. Without the warning such palaces silently used L2 distance, under which the similarity display floored every result to `Match: 0.0`. New palaces mined today already set cosine correctly and now have invariant tests pinning that behavior so future refactors can't silently regress it. (#1179)
|
||||||
|
|||||||
@@ -6,6 +6,10 @@
|
|||||||
> domain — including `mempalace.tech` — is an impostor and may distribute
|
> domain — including `mempalace.tech` — is an impostor and may distribute
|
||||||
> malware. Details and timeline: [docs/HISTORY.md](docs/HISTORY.md).
|
> malware. Details and timeline: [docs/HISTORY.md](docs/HISTORY.md).
|
||||||
|
|
||||||
|
> [!IMPORTANT]
|
||||||
|
> **🚨 Claude Code sessions expire in 30 days w/out auto-save hooks wired!** **[Read this →](https://github.com/MemPalace/mempalace/discussions/1388)**
|
||||||
|
|
||||||
|
|
||||||
<div align="center">
|
<div align="center">
|
||||||
|
|
||||||
<img src="assets/mempalace_logo.png" alt="MemPalace" width="240">
|
<img src="assets/mempalace_logo.png" alt="MemPalace" width="240">
|
||||||
|
|||||||
@@ -0,0 +1,71 @@
|
|||||||
|
"""Stdio UTF-8 reconfiguration helper for Windows entry points.
|
||||||
|
|
||||||
|
Python on Windows defaults stdio to the system ANSI codepage
|
||||||
|
(cp1252/cp1251/cp950 depending on locale), which mojibakes UTF-8 input
|
||||||
|
or output the moment a non-Latin character shows up. Every console
|
||||||
|
entry point that touches stdio needs to fix this on Windows -- the MCP
|
||||||
|
server, the CLI, the fact_checker `--stdin` mode -- so the
|
||||||
|
reconfigure code lives here in one place to keep the per-stream
|
||||||
|
errors policies aligned across them.
|
||||||
|
|
||||||
|
Per-stream errors policy is caller-chosen:
|
||||||
|
|
||||||
|
* MCP server uses ``strict`` on stdout/stderr because everything written
|
||||||
|
there is server-controlled JSON-RPC; any encode failure is a real bug
|
||||||
|
the operator wants loud.
|
||||||
|
* CLI / fact_checker use ``replace`` on stdout/stderr because they print
|
||||||
|
verbatim drawer text that may contain surrogate halves round-tripped
|
||||||
|
from filenames -- ``strict`` would crash mid-print.
|
||||||
|
* All callers use ``surrogateescape`` on stdin so a malformed byte from
|
||||||
|
a redirected file or a misbehaving client survives as a lone surrogate
|
||||||
|
the consumer's parser surfaces, instead of ``UnicodeDecodeError``
|
||||||
|
killing the read loop on the first bad byte.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import sys
|
||||||
|
from typing import Callable, Optional
|
||||||
|
|
||||||
|
|
||||||
|
def reconfigure_stdio_utf8_on_windows(
|
||||||
|
*,
|
||||||
|
stdin_errors: str = "surrogateescape",
|
||||||
|
stdout_errors: str = "strict",
|
||||||
|
stderr_errors: str = "strict",
|
||||||
|
on_failure: Optional[Callable[[str, BaseException], None]] = None,
|
||||||
|
) -> None:
|
||||||
|
"""Reconfigure stdio to UTF-8 on Windows. No-op elsewhere.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
stdin_errors: errors= policy for stdin.reconfigure().
|
||||||
|
stdout_errors: errors= policy for stdout.reconfigure().
|
||||||
|
stderr_errors: errors= policy for stderr.reconfigure().
|
||||||
|
on_failure: optional ``(stream_name, exc) -> None`` callback for
|
||||||
|
streams whose ``reconfigure`` raises (e.g. Jupyter-replaced
|
||||||
|
streams that lack the method-shape we expect). Defaults to a
|
||||||
|
``WARNING:`` line on the original sys.stderr.
|
||||||
|
"""
|
||||||
|
if sys.platform != "win32":
|
||||||
|
return
|
||||||
|
|
||||||
|
policies = (
|
||||||
|
("stdin", stdin_errors),
|
||||||
|
("stdout", stdout_errors),
|
||||||
|
("stderr", stderr_errors),
|
||||||
|
)
|
||||||
|
for name, errors in policies:
|
||||||
|
stream = getattr(sys, name, None)
|
||||||
|
reconfigure = getattr(stream, "reconfigure", None)
|
||||||
|
if reconfigure is None:
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
reconfigure(encoding="utf-8", errors=errors)
|
||||||
|
except Exception as exc: # noqa: BLE001 -- last-resort guard
|
||||||
|
if on_failure is not None:
|
||||||
|
on_failure(name, exc)
|
||||||
|
else:
|
||||||
|
print(
|
||||||
|
f"WARNING: Could not reconfigure {name} to UTF-8: {exc}",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
+293
-95
@@ -1,9 +1,12 @@
|
|||||||
"""ChromaDB-backed MemPalace storage backend (RFC 001 reference implementation)."""
|
"""ChromaDB-backed MemPalace storage backend (RFC 001 reference implementation)."""
|
||||||
|
|
||||||
|
import contextlib
|
||||||
import datetime as _dt
|
import datetime as _dt
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
|
import pickle
|
||||||
import sqlite3
|
import sqlite3
|
||||||
|
from numbers import Integral
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Any, Optional
|
from typing import Any, Optional
|
||||||
|
|
||||||
@@ -29,6 +32,51 @@ _REQUIRED_OPERATORS = frozenset({"$eq", "$ne", "$in", "$nin", "$and", "$or", "$c
|
|||||||
_OPTIONAL_OPERATORS = frozenset({"$gt", "$gte", "$lt", "$lte"})
|
_OPTIONAL_OPERATORS = frozenset({"$gt", "$gte", "$lt", "$lte"})
|
||||||
_SUPPORTED_OPERATORS = _REQUIRED_OPERATORS | _OPTIONAL_OPERATORS
|
_SUPPORTED_OPERATORS = _REQUIRED_OPERATORS | _OPTIONAL_OPERATORS
|
||||||
|
|
||||||
|
# A healthy HNSW payload should keep link_lists.bin proportional to
|
||||||
|
# data_level0.bin. When link_lists.bin grows orders of magnitude larger than
|
||||||
|
# data_level0.bin, Chroma/HNSW can segfault while opening the segment even if
|
||||||
|
# index_metadata.pickle is structurally valid.
|
||||||
|
#
|
||||||
|
# The report in #1218 showed ratios above 300x, while healthy snapshots were far below 1x.
|
||||||
|
# Treat only >10x as corruption so normal flush lag or small segments do not get
|
||||||
|
# quarantined.
|
||||||
|
_HNSW_LINK_TO_DATA_MAX_RATIO = 10.0
|
||||||
|
|
||||||
|
|
||||||
|
def _hnsw_link_to_data_ratio(seg_dir: str) -> Optional[float]:
|
||||||
|
"""Return link_lists.bin / data_level0.bin size ratio for a segment.
|
||||||
|
|
||||||
|
``None`` means the ratio is not meaningful, usually because one file is
|
||||||
|
missing or data_level0.bin is empty. ``float("inf")`` means the files were
|
||||||
|
present but could not be statted safely, which should be treated as
|
||||||
|
suspicious by callers.
|
||||||
|
"""
|
||||||
|
|
||||||
|
link_path = os.path.join(seg_dir, "link_lists.bin")
|
||||||
|
data_path = os.path.join(seg_dir, "data_level0.bin")
|
||||||
|
|
||||||
|
if not (os.path.isfile(link_path) and os.path.isfile(data_path)):
|
||||||
|
return None
|
||||||
|
|
||||||
|
try:
|
||||||
|
data_size = os.path.getsize(data_path)
|
||||||
|
link_size = os.path.getsize(link_path)
|
||||||
|
except OSError:
|
||||||
|
return float("inf")
|
||||||
|
|
||||||
|
if data_size <= 0:
|
||||||
|
return None
|
||||||
|
|
||||||
|
return link_size / data_size
|
||||||
|
|
||||||
|
|
||||||
|
def _hnsw_payload_appears_sane(seg_dir: str) -> bool:
|
||||||
|
"""Return False when HNSW payload files are structurally implausible."""
|
||||||
|
|
||||||
|
ratio = _hnsw_link_to_data_ratio(seg_dir)
|
||||||
|
return ratio is None or ratio <= _HNSW_LINK_TO_DATA_MAX_RATIO
|
||||||
|
|
||||||
|
|
||||||
# HNSW tuning to prevent link_lists.bin bloat on large mines (#344).
|
# HNSW tuning to prevent link_lists.bin bloat on large mines (#344).
|
||||||
#
|
#
|
||||||
# With default params (batch_size=100, sync_threshold=1000, initial capacity
|
# With default params (batch_size=100, sync_threshold=1000, initial capacity
|
||||||
@@ -106,6 +154,9 @@ def _segment_appears_healthy(seg_dir: str) -> bool:
|
|||||||
files and quarantine_stale_hnsw would conservatively rename them
|
files and quarantine_stale_hnsw would conservatively rename them
|
||||||
out of the way (lazy rebuild on next open recovers).
|
out of the way (lazy rebuild on next open recovers).
|
||||||
"""
|
"""
|
||||||
|
if not _hnsw_payload_appears_sane(seg_dir):
|
||||||
|
return False
|
||||||
|
|
||||||
meta_path = os.path.join(seg_dir, "index_metadata.pickle")
|
meta_path = os.path.join(seg_dir, "index_metadata.pickle")
|
||||||
if not os.path.isfile(meta_path):
|
if not os.path.isfile(meta_path):
|
||||||
# No metadata file yet — segment hasn't flushed (fresh / empty).
|
# No metadata file yet — segment hasn't flushed (fresh / empty).
|
||||||
@@ -127,64 +178,35 @@ def _segment_appears_healthy(seg_dir: str) -> bool:
|
|||||||
|
|
||||||
|
|
||||||
def quarantine_stale_hnsw(palace_path: str, stale_seconds: float = 300.0) -> list[str]:
|
def quarantine_stale_hnsw(palace_path: str, stale_seconds: float = 300.0) -> list[str]:
|
||||||
"""Rename HNSW segment dirs that are both stale-by-mtime AND fail an
|
"""Rename HNSW segment dirs that look unsafe to open.
|
||||||
integrity sniff-test.
|
|
||||||
|
|
||||||
Catches the segfault failure mode from #823 (semantic search stale
|
This catches two classes of HNSW corruption before ChromaDB opens the
|
||||||
after ``add_drawer``), observed at neo-cortex-mcp#2 (SIGSEGV on
|
native segment reader:
|
||||||
``count()`` with chromadb 1.5.5), and acknowledged as by-design at
|
|
||||||
chroma-core/chroma#2594. Renaming a corrupt segment lets chromadb
|
|
||||||
rebuild lazily on next open instead of segfaulting.
|
|
||||||
|
|
||||||
Two-stage check:
|
1. stale-by-mtime segments whose ``index_metadata.pickle`` fails the
|
||||||
|
existing format sniff-test;
|
||||||
|
2. structurally impossible HNSW payloads where ``link_lists.bin`` is much
|
||||||
|
larger than ``data_level0.bin``.
|
||||||
|
|
||||||
1. **mtime gate.** If ``chroma.sqlite3`` is less than
|
The second check is intentionally not gated by mtime. A segment with a
|
||||||
``stale_seconds`` newer than the segment's ``data_level0.bin``,
|
300x link/data ratio is unsafe regardless of whether its mtime is recent;
|
||||||
skip — chromadb is in normal write-path territory.
|
letting Chroma open it can SIGSEGV before Python fallback code runs.
|
||||||
|
|
||||||
2. **Integrity gate** (``_segment_appears_healthy``). Even when the
|
The original directory is renamed, not deleted, so recovery remains
|
||||||
mtime gap exceeds the threshold, a segment whose
|
possible if the heuristic ever misfires.
|
||||||
``index_metadata.pickle`` passes a format sniff-test is healthy:
|
|
||||||
chromadb 1.5.x flushes HNSW state asynchronously and a clean
|
|
||||||
shutdown does NOT force-flush, so the on-disk HNSW is *always*
|
|
||||||
somewhat older than ``chroma.sqlite3``. Production observation
|
|
||||||
(2026-04-26 disks daemon): three of three segments quarantined
|
|
||||||
on every cold start, with 538-557s gaps, leaving the 151K-drawer
|
|
||||||
palace with vector_ranked=0 until rebuild. Renaming a healthy
|
|
||||||
segment based on mtime alone destroys a valid index — chromadb
|
|
||||||
creates an empty replacement, orphaning every drawer in sqlite
|
|
||||||
from vector recall until the operator runs ``mempalace repair
|
|
||||||
--mode rebuild`` (15+ min on a 151K palace).
|
|
||||||
|
|
||||||
Only segments that pass stage 1 (suspiciously stale) AND fail stage
|
|
||||||
2 (metadata file truncated, zero-filled, or absent-with-data) are
|
|
||||||
renamed to ``<uuid>.drift-<timestamp>``. The original directory is
|
|
||||||
renamed, not deleted, so recovery remains possible if the heuristic
|
|
||||||
misfires.
|
|
||||||
|
|
||||||
The default threshold (5 min) is advisory under daemon-strict; the
|
|
||||||
integrity gate is what actually distinguishes corruption from flush
|
|
||||||
lag. The threshold still matters for the cross-machine replication
|
|
||||||
case (#823), where it bounds how stale a Syncthing-replicated
|
|
||||||
segment can be before we look harder at it.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
palace_path: path to the palace directory containing ``chroma.sqlite3``
|
|
||||||
stale_seconds: minimum mtime gap to *consider* a segment for quarantine
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of paths that were quarantined (empty if nothing actually
|
|
||||||
looked corrupt).
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
||||||
if not os.path.isfile(db_path):
|
if not os.path.isfile(db_path):
|
||||||
return []
|
return []
|
||||||
|
|
||||||
try:
|
try:
|
||||||
sqlite_mtime = os.path.getmtime(db_path)
|
sqlite_mtime = os.path.getmtime(db_path)
|
||||||
except OSError:
|
except OSError:
|
||||||
return []
|
return []
|
||||||
|
|
||||||
moved: list[str] = []
|
moved: list[str] = []
|
||||||
|
|
||||||
try:
|
try:
|
||||||
entries = os.listdir(palace_path)
|
entries = os.listdir(palace_path)
|
||||||
except OSError:
|
except OSError:
|
||||||
@@ -193,29 +215,34 @@ def quarantine_stale_hnsw(palace_path: str, stale_seconds: float = 300.0) -> lis
|
|||||||
for name in entries:
|
for name in entries:
|
||||||
if "-" not in name or name.startswith(".") or ".drift-" in name:
|
if "-" not in name or name.startswith(".") or ".drift-" in name:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
seg_dir = os.path.join(palace_path, name)
|
seg_dir = os.path.join(palace_path, name)
|
||||||
if not os.path.isdir(seg_dir):
|
if not os.path.isdir(seg_dir):
|
||||||
continue
|
continue
|
||||||
|
|
||||||
hnsw_bin = os.path.join(seg_dir, "data_level0.bin")
|
hnsw_bin = os.path.join(seg_dir, "data_level0.bin")
|
||||||
if not os.path.isfile(hnsw_bin):
|
if not os.path.isfile(hnsw_bin):
|
||||||
continue
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
hnsw_mtime = os.path.getmtime(hnsw_bin)
|
hnsw_mtime = os.path.getmtime(hnsw_bin)
|
||||||
except OSError:
|
except OSError:
|
||||||
continue
|
continue
|
||||||
if sqlite_mtime - hnsw_mtime < stale_seconds:
|
|
||||||
|
payload_ratio = _hnsw_link_to_data_ratio(seg_dir)
|
||||||
|
payload_corrupt = payload_ratio is not None and payload_ratio > _HNSW_LINK_TO_DATA_MAX_RATIO
|
||||||
|
|
||||||
|
if not payload_corrupt and sqlite_mtime - hnsw_mtime < stale_seconds:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Stage 2: integrity gate. mtime drift is necessary but not
|
# Stage 2: integrity gate. Mtime drift alone is not corruption because
|
||||||
# sufficient — chromadb's async flush makes drift the steady-
|
# Chroma flushes HNSW asynchronously. A healthy metadata file proves the
|
||||||
# state condition. A healthy segment metadata file proves
|
# ordinary stale-by-mtime case is just flush lag.
|
||||||
# chromadb can open the segment without segfault; don't
|
if not payload_corrupt and _segment_appears_healthy(seg_dir):
|
||||||
# quarantine a healthy index.
|
|
||||||
if _segment_appears_healthy(seg_dir):
|
|
||||||
logger.info(
|
logger.info(
|
||||||
"HNSW mtime gap %.0fs on %s exceeds threshold but segment "
|
"HNSW mtime gap %.0fs on %s exceeds threshold but segment "
|
||||||
"metadata file is intact — flush-lag, not corruption. "
|
"metadata and payload size are intact — flush-lag, not "
|
||||||
"Leaving in place.",
|
"corruption. Leaving in place.",
|
||||||
sqlite_mtime - hnsw_mtime,
|
sqlite_mtime - hnsw_mtime,
|
||||||
seg_dir,
|
seg_dir,
|
||||||
)
|
)
|
||||||
@@ -223,17 +250,30 @@ def quarantine_stale_hnsw(palace_path: str, stale_seconds: float = 300.0) -> lis
|
|||||||
|
|
||||||
stamp = _dt.datetime.now().strftime("%Y%m%d-%H%M%S")
|
stamp = _dt.datetime.now().strftime("%Y%m%d-%H%M%S")
|
||||||
target = f"{seg_dir}.drift-{stamp}"
|
target = f"{seg_dir}.drift-{stamp}"
|
||||||
|
|
||||||
|
if payload_corrupt:
|
||||||
|
reason = (
|
||||||
|
f"link_lists.bin/data_level0.bin ratio {payload_ratio:.1f}x "
|
||||||
|
f"exceeds {_HNSW_LINK_TO_DATA_MAX_RATIO:.1f}x"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
reason = (
|
||||||
|
f"sqlite {sqlite_mtime - hnsw_mtime:.0f}s newer than HNSW "
|
||||||
|
"and integrity check failed"
|
||||||
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
os.rename(seg_dir, target)
|
os.rename(seg_dir, target)
|
||||||
moved.append(target)
|
moved.append(target)
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"Quarantined corrupt HNSW segment %s (sqlite %.0fs newer than HNSW, integrity check failed); renamed to %s",
|
"Quarantined corrupt HNSW segment %s (%s); renamed to %s",
|
||||||
seg_dir,
|
seg_dir,
|
||||||
sqlite_mtime - hnsw_mtime,
|
reason,
|
||||||
target,
|
target,
|
||||||
)
|
)
|
||||||
except OSError:
|
except OSError:
|
||||||
logger.exception("Failed to quarantine corrupt HNSW segment %s", seg_dir)
|
logger.exception("Failed to quarantine corrupt HNSW segment %s", seg_dir)
|
||||||
|
|
||||||
return moved
|
return moved
|
||||||
|
|
||||||
|
|
||||||
@@ -489,22 +529,17 @@ def hnsw_capacity_status(palace_path: str, collection_name: str = "mempalace_dra
|
|||||||
divergence_floor = max(_HNSW_DIVERGENCE_FALLBACK_FLOOR, 2 * sync_threshold)
|
divergence_floor = max(_HNSW_DIVERGENCE_FALLBACK_FLOOR, 2 * sync_threshold)
|
||||||
|
|
||||||
if hnsw_count is None:
|
if hnsw_count is None:
|
||||||
# No pickle yet — segment hasn't persisted metadata. Could be
|
# No pickle yet, so this probe cannot measure HNSW capacity.
|
||||||
# fresh-but-unflushed (normal) or interrupted-mid-flush (bad).
|
# Chroma 1.5.x can have binary HNSW files without a flushed
|
||||||
# We can't distinguish without the pickle, so only flag
|
# metadata pickle; absence of the pickle alone is not proof that
|
||||||
# divergence when sqlite holds clearly more than two flush
|
# vector search is unusable or dangerous. Keep the status unknown
|
||||||
# windows worth — same threshold as the with-pickle path.
|
# so MCP does not globally disable vectors on an inconclusive
|
||||||
if sqlite_count > divergence_floor:
|
# signal. Corrupt/invalid metadata, when present, is handled by
|
||||||
out["status"] = "diverged"
|
# quarantine_invalid_hnsw_metadata before Chroma opens.
|
||||||
out["diverged"] = True
|
|
||||||
out["divergence"] = sqlite_count
|
|
||||||
out["message"] = (
|
out["message"] = (
|
||||||
f"sqlite holds {sqlite_count:,} embeddings but the HNSW segment "
|
"HNSW capacity unavailable: metadata has not been flushed; "
|
||||||
"has never flushed metadata — vector search will return nothing "
|
"leaving vector search enabled"
|
||||||
"until the segment is rebuilt. Run `mempalace repair`."
|
|
||||||
)
|
)
|
||||||
else:
|
|
||||||
out["message"] = "HNSW segment metadata not yet flushed; skipping"
|
|
||||||
return out
|
return out
|
||||||
|
|
||||||
divergence = sqlite_count - hnsw_count
|
divergence = sqlite_count - hnsw_count
|
||||||
@@ -591,6 +626,97 @@ def _pin_hnsw_threads(collection) -> None:
|
|||||||
_BLOB_FIX_MARKER = ".blob_seq_ids_migrated"
|
_BLOB_FIX_MARKER = ".blob_seq_ids_migrated"
|
||||||
|
|
||||||
|
|
||||||
|
def _valid_dimensionality(value: object) -> bool:
|
||||||
|
return isinstance(value, Integral) and not isinstance(value, bool) and int(value) > 0
|
||||||
|
|
||||||
|
|
||||||
|
def _persisted_metadata_fields(obj: object) -> tuple[object, object]:
|
||||||
|
if isinstance(obj, dict):
|
||||||
|
return obj.get("dimensionality"), obj.get("id_to_label")
|
||||||
|
return getattr(obj, "dimensionality", None), getattr(obj, "id_to_label", None)
|
||||||
|
|
||||||
|
|
||||||
|
def quarantine_invalid_hnsw_metadata(palace_path: str) -> list[str]:
|
||||||
|
"""Quarantine segment dirs whose ``index_metadata.pickle`` is unreadable or invalid.
|
||||||
|
|
||||||
|
Chroma's persisted HNSW metadata is untrusted disk state. If a segment has
|
||||||
|
labels but no valid positive dimensionality, current Chroma versions can
|
||||||
|
accept the pickle and crash later in the Rust loader. We rename the entire
|
||||||
|
segment out of the way before ``PersistentClient`` opens so Chroma can
|
||||||
|
rebuild cleanly instead of touching known-bad metadata.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
entries = os.listdir(palace_path)
|
||||||
|
except OSError:
|
||||||
|
return []
|
||||||
|
|
||||||
|
moved: list[str] = []
|
||||||
|
for name in entries:
|
||||||
|
if "-" not in name or name.startswith(".") or ".drift-" in name or ".corrupt-" in name:
|
||||||
|
continue
|
||||||
|
seg_dir = os.path.join(palace_path, name)
|
||||||
|
if not os.path.isdir(seg_dir):
|
||||||
|
continue
|
||||||
|
|
||||||
|
meta_path = os.path.join(seg_dir, "index_metadata.pickle")
|
||||||
|
if not os.path.isfile(meta_path):
|
||||||
|
continue
|
||||||
|
|
||||||
|
reason = None
|
||||||
|
try:
|
||||||
|
persisted = _SafePersistentDataUnpickler.load(meta_path)
|
||||||
|
except (EOFError, OSError):
|
||||||
|
logger.debug(
|
||||||
|
"Skipping invalid-HNSW quarantine for transient metadata read in %s",
|
||||||
|
meta_path,
|
||||||
|
exc_info=True,
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
except pickle.UnpicklingError as exc:
|
||||||
|
if "truncated" in str(exc).lower() or "ran out of input" in str(exc).lower():
|
||||||
|
logger.debug(
|
||||||
|
"Skipping invalid-HNSW quarantine for transient metadata read in %s",
|
||||||
|
meta_path,
|
||||||
|
exc_info=True,
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
reason = f"invalid index_metadata.pickle: {exc}"
|
||||||
|
except Exception as exc:
|
||||||
|
reason = f"invalid index_metadata.pickle: {exc}"
|
||||||
|
else:
|
||||||
|
if not isinstance(persisted, dict) and not (
|
||||||
|
hasattr(persisted, "dimensionality") or hasattr(persisted, "id_to_label")
|
||||||
|
):
|
||||||
|
reason = f"unrecognized index_metadata.pickle payload: {type(persisted).__name__}"
|
||||||
|
else:
|
||||||
|
dimensionality, id_to_label = _persisted_metadata_fields(persisted)
|
||||||
|
if id_to_label is not None and not isinstance(id_to_label, dict):
|
||||||
|
reason = f"invalid id_to_label type {type(id_to_label).__name__}"
|
||||||
|
else:
|
||||||
|
has_labels = bool(id_to_label)
|
||||||
|
if has_labels and not _valid_dimensionality(dimensionality):
|
||||||
|
reason = (
|
||||||
|
"labels present but dimensionality is missing or invalid "
|
||||||
|
f"({dimensionality!r})"
|
||||||
|
)
|
||||||
|
elif dimensionality is not None and not _valid_dimensionality(dimensionality):
|
||||||
|
reason = f"invalid dimensionality {dimensionality!r}"
|
||||||
|
|
||||||
|
if reason is None:
|
||||||
|
continue
|
||||||
|
|
||||||
|
stamp = _dt.datetime.now().strftime("%Y%m%d-%H%M%S")
|
||||||
|
target = f"{seg_dir}.corrupt-{stamp}"
|
||||||
|
try:
|
||||||
|
os.rename(seg_dir, target)
|
||||||
|
moved.append(target)
|
||||||
|
logger.warning("Quarantined invalid HNSW metadata in %s: %s", seg_dir, reason)
|
||||||
|
except OSError:
|
||||||
|
logger.exception("Failed to quarantine invalid HNSW metadata in %s", seg_dir)
|
||||||
|
|
||||||
|
return moved
|
||||||
|
|
||||||
|
|
||||||
def _fix_blob_seq_ids(palace_path: str) -> None:
|
def _fix_blob_seq_ids(palace_path: str) -> None:
|
||||||
"""Fix ChromaDB 0.6.x -> 1.5.x migration bug: BLOB seq_ids -> INTEGER.
|
"""Fix ChromaDB 0.6.x -> 1.5.x migration bug: BLOB seq_ids -> INTEGER.
|
||||||
|
|
||||||
@@ -676,11 +802,58 @@ def _as_list(v: Any) -> list:
|
|||||||
return [v]
|
return [v]
|
||||||
|
|
||||||
|
|
||||||
class ChromaCollection(BaseCollection):
|
def _close_client(client) -> None:
|
||||||
"""Thin adapter translating ChromaDB dict returns into typed results."""
|
"""Call ``PersistentClient.close()`` if available, swallow otherwise.
|
||||||
|
|
||||||
def __init__(self, collection):
|
chromadb 1.5.x exposes ``Client.close()`` to release rust-side SQLite
|
||||||
|
file locks; older versions relied on GC. Try/except keeps forward-compat.
|
||||||
|
"""
|
||||||
|
if client is None:
|
||||||
|
return
|
||||||
|
try:
|
||||||
|
client.close()
|
||||||
|
except Exception:
|
||||||
|
logger.debug("client.close() unavailable or failed", exc_info=True)
|
||||||
|
|
||||||
|
|
||||||
|
class ChromaCollection(BaseCollection):
|
||||||
|
"""Thin adapter translating ChromaDB dict returns into typed results.
|
||||||
|
|
||||||
|
When ``palace_path`` is set, all write methods (``add``, ``upsert``,
|
||||||
|
``update``, ``delete``) acquire ``mine_palace_lock(palace_path)`` for the
|
||||||
|
duration of the underlying chromadb call. This serializes MCP and other
|
||||||
|
direct-backend writers against ``mempalace mine`` and against each other,
|
||||||
|
closing the race between concurrent writers that triggers ChromaDB's
|
||||||
|
multi-threaded HNSW corruption (#974/#965).
|
||||||
|
|
||||||
|
The lock is the same primitive used by ``miner.mine()`` so re-entrant
|
||||||
|
acquisition from inside the mine pipeline (mine -> _mine_body ->
|
||||||
|
collection.upsert) is short-circuited by the per-thread guard inside
|
||||||
|
``mine_palace_lock`` — no self-deadlock.
|
||||||
|
|
||||||
|
``palace_path=None`` disables the wrapping, preserving the legacy
|
||||||
|
no-lock behaviour for callers that construct a ``ChromaCollection``
|
||||||
|
directly without going through ``ChromaBackend``.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, collection, palace_path: Optional[str] = None):
|
||||||
self._collection = collection
|
self._collection = collection
|
||||||
|
self._palace_path = palace_path
|
||||||
|
|
||||||
|
@contextlib.contextmanager
|
||||||
|
def _write_lock(self):
|
||||||
|
"""Acquire ``mine_palace_lock`` for the configured palace, if any.
|
||||||
|
|
||||||
|
No-op (yields immediately) when ``self._palace_path`` is None.
|
||||||
|
"""
|
||||||
|
if self._palace_path is None:
|
||||||
|
yield
|
||||||
|
return
|
||||||
|
# Late import — palace.py imports ChromaBackend from this module.
|
||||||
|
from ..palace import mine_palace_lock
|
||||||
|
|
||||||
|
with mine_palace_lock(self._palace_path):
|
||||||
|
yield
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
# Writes
|
# Writes
|
||||||
@@ -692,6 +865,7 @@ class ChromaCollection(BaseCollection):
|
|||||||
kwargs["metadatas"] = metadatas
|
kwargs["metadatas"] = metadatas
|
||||||
if embeddings is not None:
|
if embeddings is not None:
|
||||||
kwargs["embeddings"] = embeddings
|
kwargs["embeddings"] = embeddings
|
||||||
|
with self._write_lock():
|
||||||
self._collection.add(**kwargs)
|
self._collection.add(**kwargs)
|
||||||
|
|
||||||
def upsert(self, *, documents, ids, metadatas=None, embeddings=None):
|
def upsert(self, *, documents, ids, metadatas=None, embeddings=None):
|
||||||
@@ -700,6 +874,7 @@ class ChromaCollection(BaseCollection):
|
|||||||
kwargs["metadatas"] = metadatas
|
kwargs["metadatas"] = metadatas
|
||||||
if embeddings is not None:
|
if embeddings is not None:
|
||||||
kwargs["embeddings"] = embeddings
|
kwargs["embeddings"] = embeddings
|
||||||
|
with self._write_lock():
|
||||||
self._collection.upsert(**kwargs)
|
self._collection.upsert(**kwargs)
|
||||||
|
|
||||||
def update(
|
def update(
|
||||||
@@ -719,6 +894,7 @@ class ChromaCollection(BaseCollection):
|
|||||||
kwargs["metadatas"] = metadatas
|
kwargs["metadatas"] = metadatas
|
||||||
if embeddings is not None:
|
if embeddings is not None:
|
||||||
kwargs["embeddings"] = embeddings
|
kwargs["embeddings"] = embeddings
|
||||||
|
with self._write_lock():
|
||||||
self._collection.update(**kwargs)
|
self._collection.update(**kwargs)
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
@@ -863,6 +1039,7 @@ class ChromaCollection(BaseCollection):
|
|||||||
kwargs["ids"] = ids
|
kwargs["ids"] = ids
|
||||||
if where is not None:
|
if where is not None:
|
||||||
kwargs["where"] = where
|
kwargs["where"] = where
|
||||||
|
with self._write_lock():
|
||||||
self._collection.delete(**kwargs)
|
self._collection.delete(**kwargs)
|
||||||
|
|
||||||
def count(self):
|
def count(self):
|
||||||
@@ -977,7 +1154,7 @@ class ChromaBackend(BaseBackend):
|
|||||||
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
||||||
# DB was present when cache was built but is now missing → invalidate.
|
# DB was present when cache was built but is now missing → invalidate.
|
||||||
if cached is not None and not os.path.isfile(db_path):
|
if cached is not None and not os.path.isfile(db_path):
|
||||||
self._clients.pop(palace_path, None)
|
_close_client(self._clients.pop(palace_path, None))
|
||||||
self._freshness.pop(palace_path, None)
|
self._freshness.pop(palace_path, None)
|
||||||
cached = None
|
cached = None
|
||||||
cached_inode, cached_mtime = 0, 0.0
|
cached_inode, cached_mtime = 0, 0.0
|
||||||
@@ -993,6 +1170,13 @@ class ChromaBackend(BaseBackend):
|
|||||||
)
|
)
|
||||||
|
|
||||||
if cached is None or inode_changed or mtime_changed or mtime_appeared:
|
if cached is None or inode_changed or mtime_changed or mtime_appeared:
|
||||||
|
# An inode swap means we are reopening a different physical DB
|
||||||
|
# (post-restore, fresh palace at the same path, etc.); drop the
|
||||||
|
# per-process gate so the quarantine pre-checks run again
|
||||||
|
# against the new disk state instead of trusting cached "we
|
||||||
|
# already cleaned this path" credit from the prior inode.
|
||||||
|
if inode_changed:
|
||||||
|
ChromaBackend._quarantined_paths.discard(palace_path)
|
||||||
ChromaBackend._prepare_palace_for_open(palace_path)
|
ChromaBackend._prepare_palace_for_open(palace_path)
|
||||||
cached = chromadb.PersistentClient(path=palace_path)
|
cached = chromadb.PersistentClient(path=palace_path)
|
||||||
self._clients[palace_path] = cached
|
self._clients[palace_path] = cached
|
||||||
@@ -1006,26 +1190,27 @@ class ChromaBackend(BaseBackend):
|
|||||||
# Public static helpers (legacy; prefer :meth:`get_collection`)
|
# Public static helpers (legacy; prefer :meth:`get_collection`)
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
# Per-process record of palaces that have already had quarantine_stale_hnsw
|
# Per-process record of palaces that have already had the cold-start
|
||||||
# invoked at least once. The proactive drift check is a *cold-start*
|
# quarantine invoked at least once. The proactive HNSW checks are a
|
||||||
# protection — it catches HNSW segments that arrived stale relative to
|
# *cold-start* protection — they catch segments that arrive stale relative
|
||||||
# ``chroma.sqlite3`` (e.g. cross-machine replication, partial restore,
|
# to ``chroma.sqlite3`` or invalid on disk (e.g. cross-machine replication,
|
||||||
# crashed-mid-write). Once a long-running process has opened the palace
|
# partial restore, crashed-mid-write). Once a long-running process has
|
||||||
# cleanly, re-firing on every reconnect is a *runtime thrash*: the
|
# opened the palace cleanly, re-firing the stale check on every reconnect
|
||||||
# daemon's own writes bump sqlite mtime but HNSW flushes batch on
|
# is a *runtime thrash*: the daemon's own writes bump sqlite mtime but HNSW
|
||||||
# chromadb's internal cadence, so the mtime gap naturally exceeds the
|
# flushes batch on chromadb's internal cadence, so the mtime gap naturally
|
||||||
# threshold under steady write load even though nothing is corrupt.
|
# exceeds the threshold under steady write load even though nothing is
|
||||||
|
# corrupt.
|
||||||
# Real runtime drift is still handled — palace-daemon's ``_auto_repair``
|
# Real runtime drift is still handled — palace-daemon's ``_auto_repair``
|
||||||
# calls :func:`quarantine_stale_hnsw` directly on observed HNSW errors,
|
# calls :func:`quarantine_stale_hnsw` directly on observed HNSW errors,
|
||||||
# which bypasses this gate.
|
# which bypasses this gate.
|
||||||
#
|
#
|
||||||
# Thread-safety: this set is mutated without a lock. Two concurrent
|
# Thread-safety: this set is mutated without a lock. Two concurrent
|
||||||
# ``make_client()`` calls for the same palace can both pass the
|
# ``make_client()`` calls for the same palace can both pass the
|
||||||
# membership check and both invoke ``quarantine_stale_hnsw``. That's
|
# membership check and both invoke the cold-start quarantine. That's
|
||||||
# safe because the function is idempotent (mtime check + timestamped
|
# safe because the functions are idempotent (mtime checks + timestamped
|
||||||
# rename of distinct directories), so the worst-case race produces
|
# rename of distinct directories), so the worst-case race produces one
|
||||||
# one redundant rename attempt that no-ops. Idempotency is the
|
# redundant rename attempt that no-ops. Idempotency is the safety
|
||||||
# safety property; locking would add cost without correctness gain.
|
# property; locking would add cost without correctness gain.
|
||||||
_quarantined_paths: set[str] = set()
|
_quarantined_paths: set[str] = set()
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
@@ -1033,12 +1218,16 @@ class ChromaBackend(BaseBackend):
|
|||||||
"""Run the pre-open safety pass shared by :meth:`make_client` and
|
"""Run the pre-open safety pass shared by :meth:`make_client` and
|
||||||
:meth:`_client`.
|
:meth:`_client`.
|
||||||
|
|
||||||
Two steps, both required before constructing a ``PersistentClient``:
|
Three steps, all required before constructing a ``PersistentClient``:
|
||||||
|
|
||||||
1. ``_fix_blob_seq_ids`` — repairs the BLOB seq_id quirk that bites
|
1. ``_fix_blob_seq_ids`` — repairs the BLOB seq_id quirk that bites
|
||||||
certain chromadb migrations.
|
certain chromadb migrations.
|
||||||
2. ``quarantine_stale_hnsw`` — gated by :attr:`_quarantined_paths` so
|
2. ``quarantine_invalid_hnsw_metadata`` — renames aside any HNSW
|
||||||
it fires once per palace per process. This is the SIGSEGV
|
``index_metadata.pickle`` that fails to load, so chromadb opens
|
||||||
|
against an empty index instead of crashing on the unloadable
|
||||||
|
pickle (#1266 / PR #1285).
|
||||||
|
3. ``quarantine_stale_hnsw`` — also gated by :attr:`_quarantined_paths`
|
||||||
|
so it fires once per palace per process. This is the SIGSEGV
|
||||||
prevention path for stale HNSW segments (see #1121, #1132, #1263);
|
prevention path for stale HNSW segments (see #1121, #1132, #1263);
|
||||||
wiring it through this helper means CLI mining, search, repair,
|
wiring it through this helper means CLI mining, search, repair,
|
||||||
and status all benefit, not just the legacy ``make_client``
|
and status all benefit, not just the legacy ``make_client``
|
||||||
@@ -1050,6 +1239,7 @@ class ChromaBackend(BaseBackend):
|
|||||||
"""
|
"""
|
||||||
_fix_blob_seq_ids(palace_path)
|
_fix_blob_seq_ids(palace_path)
|
||||||
if palace_path not in ChromaBackend._quarantined_paths:
|
if palace_path not in ChromaBackend._quarantined_paths:
|
||||||
|
quarantine_invalid_hnsw_metadata(palace_path)
|
||||||
quarantine_stale_hnsw(palace_path)
|
quarantine_stale_hnsw(palace_path)
|
||||||
ChromaBackend._quarantined_paths.add(palace_path)
|
ChromaBackend._quarantined_paths.add(palace_path)
|
||||||
|
|
||||||
@@ -1061,7 +1251,7 @@ class ChromaBackend(BaseBackend):
|
|||||||
own client cache. New code should obtain a collection through
|
own client cache. New code should obtain a collection through
|
||||||
:meth:`get_collection` which manages caching internally.
|
:meth:`get_collection` which manages caching internally.
|
||||||
|
|
||||||
Quarantines stale HNSW segments **once per palace per process**. See
|
Quarantines HNSW segments **once per palace per process**. See
|
||||||
:attr:`_quarantined_paths` for the rationale (cold-start protection
|
:attr:`_quarantined_paths` for the rationale (cold-start protection
|
||||||
vs. runtime thrash on steady-write daemons).
|
vs. runtime thrash on steady-write daemons).
|
||||||
"""
|
"""
|
||||||
@@ -1131,17 +1321,25 @@ class ChromaBackend(BaseBackend):
|
|||||||
else:
|
else:
|
||||||
collection = client.get_collection(collection_name, **ef_kwargs)
|
collection = client.get_collection(collection_name, **ef_kwargs)
|
||||||
_pin_hnsw_threads(collection)
|
_pin_hnsw_threads(collection)
|
||||||
return ChromaCollection(collection)
|
return ChromaCollection(collection, palace_path=palace_path)
|
||||||
|
|
||||||
def close_palace(self, palace) -> None:
|
def close_palace(self, palace) -> None:
|
||||||
"""Drop cached handles for ``palace``. Accepts ``PalaceRef`` or legacy path str."""
|
"""Drop cached handles for ``palace`` and release its SQLite file lock.
|
||||||
|
|
||||||
|
Accepts ``PalaceRef`` or legacy path str. chromadb's rust-side file
|
||||||
|
lock is held until ``PersistentClient.close()`` is called, so plain
|
||||||
|
dict eviction would leave the palace path unreopenable and
|
||||||
|
unremovable in the same process.
|
||||||
|
"""
|
||||||
path = palace.local_path if isinstance(palace, PalaceRef) else palace
|
path = palace.local_path if isinstance(palace, PalaceRef) else palace
|
||||||
if path is None:
|
if path is None:
|
||||||
return
|
return
|
||||||
self._clients.pop(path, None)
|
_close_client(self._clients.pop(path, None))
|
||||||
self._freshness.pop(path, None)
|
self._freshness.pop(path, None)
|
||||||
|
|
||||||
def close(self) -> None:
|
def close(self) -> None:
|
||||||
|
for client in self._clients.values():
|
||||||
|
_close_client(client)
|
||||||
self._clients.clear()
|
self._clients.clear()
|
||||||
self._freshness.clear()
|
self._freshness.clear()
|
||||||
self._closed = True
|
self._closed = True
|
||||||
@@ -1182,7 +1380,7 @@ class ChromaBackend(BaseBackend):
|
|||||||
},
|
},
|
||||||
**ef_kwargs,
|
**ef_kwargs,
|
||||||
)
|
)
|
||||||
return ChromaCollection(collection)
|
return ChromaCollection(collection, palace_path=palace_path)
|
||||||
|
|
||||||
|
|
||||||
def _normalize_get_collection_args(args, kwargs):
|
def _normalize_get_collection_args(args, kwargs):
|
||||||
|
|||||||
+133
-30
@@ -654,10 +654,19 @@ def cmd_repair(args):
|
|||||||
import shutil
|
import shutil
|
||||||
from .backends.chroma import ChromaBackend
|
from .backends.chroma import ChromaBackend
|
||||||
from .migrate import confirm_destructive_action, contains_palace_database
|
from .migrate import confirm_destructive_action, contains_palace_database
|
||||||
from .repair import TruncationDetected, check_extraction_safety
|
from .repair import (
|
||||||
|
RebuildCollectionError,
|
||||||
|
TruncationDetected,
|
||||||
|
_close_chroma_handles,
|
||||||
|
_extract_drawers,
|
||||||
|
_rebuild_collection_via_temp,
|
||||||
|
check_extraction_safety,
|
||||||
|
)
|
||||||
|
|
||||||
|
config = MempalaceConfig()
|
||||||
|
collection_name = config.collection_name
|
||||||
palace_path = os.path.abspath(
|
palace_path = os.path.abspath(
|
||||||
os.path.expanduser(args.palace) if args.palace else MempalaceConfig().palace_path
|
os.path.expanduser(args.palace) if args.palace else config.palace_path
|
||||||
)
|
)
|
||||||
|
|
||||||
if getattr(args, "mode", "legacy") == "max-seq-id":
|
if getattr(args, "mode", "legacy") == "max-seq-id":
|
||||||
@@ -673,6 +682,57 @@ def cmd_repair(args):
|
|||||||
)
|
)
|
||||||
return
|
return
|
||||||
|
|
||||||
|
if getattr(args, "mode", "legacy") == "from-sqlite":
|
||||||
|
from .migrate import confirm_destructive_action
|
||||||
|
from .repair import RebuildPartialError, rebuild_from_sqlite
|
||||||
|
|
||||||
|
source_path = getattr(args, "source", None)
|
||||||
|
source_path = (
|
||||||
|
os.path.abspath(os.path.expanduser(source_path)) if source_path else palace_path
|
||||||
|
)
|
||||||
|
archive_existing = getattr(args, "archive_existing", False)
|
||||||
|
|
||||||
|
# Gate any path that touches the user's existing palace dir
|
||||||
|
# behind confirm_destructive_action. The legacy mode already
|
||||||
|
# gates; from-sqlite needs the same protection because:
|
||||||
|
# (a) --archive-existing renames the existing palace,
|
||||||
|
# (b) --source PATH writes into --palace dir which the user
|
||||||
|
# may not realize is also a palace.
|
||||||
|
# No prompt when source != dest AND dest does not exist (pure
|
||||||
|
# extract-into-fresh-dir case is non-destructive to existing
|
||||||
|
# palaces).
|
||||||
|
is_destructive_to_dest = source_path == palace_path or os.path.exists(palace_path)
|
||||||
|
if is_destructive_to_dest and not confirm_destructive_action(
|
||||||
|
"Rebuild from SQLite", palace_path, assume_yes=getattr(args, "yes", False)
|
||||||
|
):
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
counts = rebuild_from_sqlite(
|
||||||
|
source_palace=source_path,
|
||||||
|
dest_palace=palace_path,
|
||||||
|
archive_existing_dest=archive_existing,
|
||||||
|
)
|
||||||
|
except RebuildPartialError as exc:
|
||||||
|
# The error itself was already printed by rebuild_from_sqlite
|
||||||
|
# with recovery instructions; surface a non-zero exit so
|
||||||
|
# scripts and CI gates see the failure.
|
||||||
|
print(
|
||||||
|
"\n Rebuild partial — see message above. "
|
||||||
|
f"Failed in collection: {exc.failed_collection}"
|
||||||
|
)
|
||||||
|
sys.exit(1)
|
||||||
|
# An empty counts dict is rebuild_from_sqlite's documented signal
|
||||||
|
# for a validation refusal (missing source, existing dest,
|
||||||
|
# in-place without --archive-existing). The library already
|
||||||
|
# printed an actionable message; exit non-zero so unattended
|
||||||
|
# scripts/CI distinguish "invalid inputs" from a successful
|
||||||
|
# rebuild that legitimately found zero rows (which still returns
|
||||||
|
# a populated dict with 0-valued counts).
|
||||||
|
if not counts:
|
||||||
|
sys.exit(1)
|
||||||
|
return
|
||||||
|
|
||||||
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
db_path = os.path.join(palace_path, "chroma.sqlite3")
|
||||||
|
|
||||||
if not os.path.isdir(palace_path):
|
if not os.path.isdir(palace_path):
|
||||||
@@ -691,7 +751,7 @@ def cmd_repair(args):
|
|||||||
|
|
||||||
# Try to read existing drawers
|
# Try to read existing drawers
|
||||||
try:
|
try:
|
||||||
col = backend.get_collection(palace_path, "mempalace_drawers")
|
col = backend.get_collection(palace_path, collection_name)
|
||||||
total = col.count()
|
total = col.count()
|
||||||
print(f" Drawers found: {total}")
|
print(f" Drawers found: {total}")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
@@ -711,18 +771,7 @@ def cmd_repair(args):
|
|||||||
# Extract all drawers in batches
|
# Extract all drawers in batches
|
||||||
print("\n Extracting drawers...")
|
print("\n Extracting drawers...")
|
||||||
batch_size = 5000
|
batch_size = 5000
|
||||||
all_ids = []
|
all_ids, all_docs, all_metas = _extract_drawers(col, total, batch_size)
|
||||||
all_docs = []
|
|
||||||
all_metas = []
|
|
||||||
offset = 0
|
|
||||||
while offset < total:
|
|
||||||
batch = col.get(limit=batch_size, offset=offset, include=["documents", "metadatas"])
|
|
||||||
if not batch["ids"]:
|
|
||||||
break
|
|
||||||
all_ids.extend(batch["ids"])
|
|
||||||
all_docs.extend(batch["documents"])
|
|
||||||
all_metas.extend(batch["metadatas"])
|
|
||||||
offset += len(batch["ids"])
|
|
||||||
print(f" Extracted {len(all_ids)} drawers")
|
print(f" Extracted {len(all_ids)} drawers")
|
||||||
|
|
||||||
# ── #1208 guard ──────────────────────────────────────────────────
|
# ── #1208 guard ──────────────────────────────────────────────────
|
||||||
@@ -737,12 +786,12 @@ def cmd_repair(args):
|
|||||||
palace_path,
|
palace_path,
|
||||||
len(all_ids),
|
len(all_ids),
|
||||||
confirm_truncation_ok=getattr(args, "confirm_truncation_ok", False),
|
confirm_truncation_ok=getattr(args, "confirm_truncation_ok", False),
|
||||||
|
collection_name=collection_name,
|
||||||
)
|
)
|
||||||
except TruncationDetected as e:
|
except TruncationDetected as e:
|
||||||
print(e.message)
|
print(e.message)
|
||||||
return
|
return
|
||||||
|
|
||||||
# Backup and rebuild
|
|
||||||
palace_path = os.path.normpath(palace_path)
|
palace_path = os.path.normpath(palace_path)
|
||||||
backup_path = palace_path + ".backup"
|
backup_path = palace_path + ".backup"
|
||||||
if os.path.exists(backup_path):
|
if os.path.exists(backup_path):
|
||||||
@@ -756,18 +805,34 @@ def cmd_repair(args):
|
|||||||
print(f" Backing up to {backup_path}...")
|
print(f" Backing up to {backup_path}...")
|
||||||
shutil.copytree(palace_path, backup_path)
|
shutil.copytree(palace_path, backup_path)
|
||||||
|
|
||||||
print(" Rebuilding collection...")
|
try:
|
||||||
backend.delete_collection(palace_path, "mempalace_drawers")
|
filed = _rebuild_collection_via_temp(
|
||||||
new_col = backend.create_collection(palace_path, "mempalace_drawers")
|
backend,
|
||||||
|
palace_path,
|
||||||
filed = 0
|
all_ids,
|
||||||
for i in range(0, len(all_ids), batch_size):
|
all_docs,
|
||||||
batch_ids = all_ids[i : i + batch_size]
|
all_metas,
|
||||||
batch_docs = all_docs[i : i + batch_size]
|
batch_size,
|
||||||
batch_metas = all_metas[i : i + batch_size]
|
collection_name=collection_name,
|
||||||
new_col.add(documents=batch_docs, ids=batch_ids, metadatas=batch_metas)
|
progress=print,
|
||||||
filed += len(batch_ids)
|
)
|
||||||
print(f" Re-filed {filed}/{len(all_ids)} drawers...")
|
except RebuildCollectionError as e:
|
||||||
|
print(f" Repair failed: {e}")
|
||||||
|
if getattr(e, "live_replaced", False):
|
||||||
|
print(" Live collection was already replaced; restoring from backup...")
|
||||||
|
try:
|
||||||
|
_close_chroma_handles(palace_path, backend=backend)
|
||||||
|
if os.path.exists(palace_path):
|
||||||
|
shutil.rmtree(palace_path)
|
||||||
|
shutil.copytree(backup_path, palace_path)
|
||||||
|
print(f" Restore complete from backup: {backup_path}")
|
||||||
|
except Exception as restore_error:
|
||||||
|
print(f" Automatic restore failed: {restore_error}")
|
||||||
|
print(" Manual recovery required:")
|
||||||
|
print(f" 1. Remove or rename the broken directory: {palace_path}")
|
||||||
|
print(f" 2. Restore the backup directory to: {palace_path}")
|
||||||
|
print(f" Backup location: {backup_path}")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
print(f"\n Repair complete. {filed} drawers rebuilt.")
|
print(f"\n Repair complete. {filed} drawers rebuilt.")
|
||||||
print(f" Backup saved at {backup_path}")
|
print(f" Backup saved at {backup_path}")
|
||||||
@@ -935,7 +1000,25 @@ def cmd_compress(args):
|
|||||||
print(" (dry run -- nothing stored)")
|
print(" (dry run -- nothing stored)")
|
||||||
|
|
||||||
|
|
||||||
|
def _reconfigure_stdio_utf8_on_windows():
|
||||||
|
"""Decode stdio as UTF-8 on Windows for the primary `mempalace` CLI.
|
||||||
|
|
||||||
|
Thin wrapper around the shared helper in ``mempalace._stdio``. The CLI
|
||||||
|
overrides stdout/stderr to ``replace`` because ``mempalace search``
|
||||||
|
prints verbatim drawer text that may carry surrogate halves
|
||||||
|
round-tripped from filenames -- ``strict`` would crash mid-print and
|
||||||
|
lose the rest of the search result block. stdin keeps the default
|
||||||
|
``surrogateescape`` so a redirected non-UTF-8 file does not kill the
|
||||||
|
read on the first bad byte.
|
||||||
|
"""
|
||||||
|
from ._stdio import reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
reconfigure_stdio_utf8_on_windows(stdout_errors="replace", stderr_errors="replace")
|
||||||
|
|
||||||
|
|
||||||
def main():
|
def main():
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
version_label = f"MemPalace {__version__}"
|
version_label = f"MemPalace {__version__}"
|
||||||
parser = argparse.ArgumentParser(
|
parser = argparse.ArgumentParser(
|
||||||
description="MemPalace — Give your AI a memory. No API key required.",
|
description="MemPalace — Give your AI a memory. No API key required.",
|
||||||
@@ -1195,11 +1278,31 @@ def main():
|
|||||||
)
|
)
|
||||||
p_repair.add_argument(
|
p_repair.add_argument(
|
||||||
"--mode",
|
"--mode",
|
||||||
choices=["legacy", "max-seq-id"],
|
choices=["legacy", "max-seq-id", "from-sqlite"],
|
||||||
default="legacy",
|
default="legacy",
|
||||||
help=(
|
help=(
|
||||||
"legacy: full-palace rebuild (default). "
|
"legacy: full-palace rebuild via the chromadb client (default). "
|
||||||
"max-seq-id: un-poison max_seq_id rows corrupted by the legacy 0.6.x shim. "
|
"max-seq-id: un-poison max_seq_id rows corrupted by the legacy 0.6.x shim. "
|
||||||
|
"from-sqlite: rebuild by reading rows directly from chroma.sqlite3, "
|
||||||
|
"bypassing the chromadb client. Use when legacy mode bails because the "
|
||||||
|
"chromadb client cannot open the collection."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
p_repair.add_argument(
|
||||||
|
"--source",
|
||||||
|
default=None,
|
||||||
|
help=(
|
||||||
|
"Source palace path for --mode from-sqlite (defaults to --palace). "
|
||||||
|
"Use when extracting from an archived corrupt palace into a new location."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
p_repair.add_argument(
|
||||||
|
"--archive-existing",
|
||||||
|
action="store_true",
|
||||||
|
help=(
|
||||||
|
"For --mode from-sqlite when --source equals --palace: rename the "
|
||||||
|
"existing palace to <palace>.pre-rebuild-<timestamp> before "
|
||||||
|
"rebuilding so the corrupt copy is preserved."
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
p_repair.add_argument(
|
p_repair.add_argument(
|
||||||
|
|||||||
+23
-3
@@ -40,6 +40,7 @@ import json
|
|||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
import time
|
import time
|
||||||
|
import urllib.parse
|
||||||
import urllib.request
|
import urllib.request
|
||||||
import urllib.error
|
import urllib.error
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
@@ -101,6 +102,14 @@ class LLMConfig:
|
|||||||
self.endpoint = (endpoint or os.environ.get("LLM_ENDPOINT", "")).rstrip("/")
|
self.endpoint = (endpoint or os.environ.get("LLM_ENDPOINT", "")).rstrip("/")
|
||||||
self.key = key or os.environ.get("LLM_KEY", "")
|
self.key = key or os.environ.get("LLM_KEY", "")
|
||||||
self.model = model or os.environ.get("LLM_MODEL", "")
|
self.model = model or os.environ.get("LLM_MODEL", "")
|
||||||
|
if self.endpoint:
|
||||||
|
# Privacy-by-architecture: reject file:// and other non-HTTP schemes
|
||||||
|
# so a misconfigured endpoint cannot exfiltrate local files.
|
||||||
|
scheme = urllib.parse.urlparse(self.endpoint).scheme.lower()
|
||||||
|
if scheme not in ("http", "https"):
|
||||||
|
raise ValueError(
|
||||||
|
f"LLM_ENDPOINT must use http:// or https:// (got scheme {scheme!r})"
|
||||||
|
)
|
||||||
|
|
||||||
def missing(self) -> list:
|
def missing(self) -> list:
|
||||||
missing = []
|
missing = []
|
||||||
@@ -221,9 +230,19 @@ def regenerate_closets(
|
|||||||
print("No drawers in palace.")
|
print("No drawers in palace.")
|
||||||
return {"processed": 0}
|
return {"processed": 0}
|
||||||
|
|
||||||
all_data = drawers_col.get(limit=total, include=["documents", "metadatas"])
|
# Paginate the fetch — a single get(limit=total, ...) blows through
|
||||||
by_source = {}
|
# SQLite's SQLITE_MAX_VARIABLE_NUMBER (32766) on large palaces and
|
||||||
for doc_id, doc, meta in zip(all_data["ids"], all_data["documents"], all_data["metadatas"]):
|
# crashes inside chromadb (see #802, #850, #1073).
|
||||||
|
by_source: dict = {}
|
||||||
|
batch_size = 5000
|
||||||
|
offset = 0
|
||||||
|
while offset < total:
|
||||||
|
batch = drawers_col.get(limit=batch_size, offset=offset, include=["documents", "metadatas"])
|
||||||
|
ids = batch["ids"]
|
||||||
|
if not ids:
|
||||||
|
break
|
||||||
|
for doc_id, doc, meta in zip(ids, batch["documents"], batch["metadatas"]):
|
||||||
|
meta = meta or {}
|
||||||
source = meta.get("source_file", "unknown")
|
source = meta.get("source_file", "unknown")
|
||||||
w = meta.get("wing", "")
|
w = meta.get("wing", "")
|
||||||
if wing and w != wing:
|
if wing and w != wing:
|
||||||
@@ -232,6 +251,7 @@ def regenerate_closets(
|
|||||||
by_source[source] = {"drawer_ids": [], "content": [], "meta": meta}
|
by_source[source] = {"drawer_ids": [], "content": [], "meta": meta}
|
||||||
by_source[source]["drawer_ids"].append(doc_id)
|
by_source[source]["drawer_ids"].append(doc_id)
|
||||||
by_source[source]["content"].append(doc)
|
by_source[source]["content"].append(doc)
|
||||||
|
offset += len(ids)
|
||||||
|
|
||||||
sources = list(by_source.keys())
|
sources = list(by_source.keys())
|
||||||
if sample > 0:
|
if sample > 0:
|
||||||
|
|||||||
@@ -7,6 +7,7 @@ Priority: env vars > config file (~/.mempalace/config.json) > defaults
|
|||||||
import json
|
import json
|
||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
|
from functools import lru_cache
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
@@ -81,6 +82,38 @@ def sanitize_kg_value(value: str, field_name: str = "value") -> str:
|
|||||||
return value
|
return value
|
||||||
|
|
||||||
|
|
||||||
|
# ISO-8601 date validator for knowledge-graph temporal parameters
|
||||||
|
# (as_of, valid_from, valid_to, ended). Parameterized queries already
|
||||||
|
# prevent SQL injection, but unvalidated date strings silently miss
|
||||||
|
# every row — callers cannot distinguish "no fact at this time" from
|
||||||
|
# "your date format was unrecognized." Require full YYYY-MM-DD: KG
|
||||||
|
# queries compare TEXT dates lexicographically, so partials like "2026"
|
||||||
|
# would re-introduce silent empty results (e.g. "2026-01-01" <= "2026"
|
||||||
|
# is False), defeating the purpose of validation.
|
||||||
|
_ISO_DATE_RE = re.compile(r"^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$")
|
||||||
|
|
||||||
|
|
||||||
|
def sanitize_iso_date(value, field_name: str = "date"):
|
||||||
|
"""Validate an ISO-8601 date string, accepting None or empty as-is.
|
||||||
|
|
||||||
|
Accepts only ``YYYY-MM-DD``. Raises ValueError on any other
|
||||||
|
non-empty input so the MCP layer can surface a clear error to the
|
||||||
|
caller instead of silently returning empty results. Partial dates
|
||||||
|
(``YYYY``, ``YYYY-MM``) are rejected because KG queries compare
|
||||||
|
TEXT dates lexicographically and would silently exclude valid facts.
|
||||||
|
"""
|
||||||
|
if value is None or value == "":
|
||||||
|
return value
|
||||||
|
if not isinstance(value, str):
|
||||||
|
raise ValueError(f"{field_name} must be a string")
|
||||||
|
value = value.strip()
|
||||||
|
if not _ISO_DATE_RE.match(value):
|
||||||
|
raise ValueError(
|
||||||
|
f"{field_name}={value!r} is not a valid ISO-8601 date " f"(expected YYYY-MM-DD)"
|
||||||
|
)
|
||||||
|
return value
|
||||||
|
|
||||||
|
|
||||||
def sanitize_content(value: str, max_length: int = 100_000) -> str:
|
def sanitize_content(value: str, max_length: int = 100_000) -> str:
|
||||||
"""Validate drawer/diary content length."""
|
"""Validate drawer/diary content length."""
|
||||||
if not isinstance(value, str) or not value.strip():
|
if not isinstance(value, str) or not value.strip():
|
||||||
@@ -95,6 +128,13 @@ def sanitize_content(value: str, max_length: int = 100_000) -> str:
|
|||||||
DEFAULT_PALACE_PATH = os.path.expanduser("~/.mempalace/palace")
|
DEFAULT_PALACE_PATH = os.path.expanduser("~/.mempalace/palace")
|
||||||
DEFAULT_COLLECTION_NAME = "mempalace_drawers"
|
DEFAULT_COLLECTION_NAME = "mempalace_drawers"
|
||||||
|
|
||||||
|
|
||||||
|
@lru_cache(maxsize=1)
|
||||||
|
def get_configured_collection_name() -> str:
|
||||||
|
"""Return the configured drawer collection name without repeated config-file reads."""
|
||||||
|
return MempalaceConfig().collection_name
|
||||||
|
|
||||||
|
|
||||||
DEFAULT_TOPIC_WINGS = [
|
DEFAULT_TOPIC_WINGS = [
|
||||||
"emotions",
|
"emotions",
|
||||||
"consciousness",
|
"consciousness",
|
||||||
|
|||||||
@@ -11,6 +11,7 @@ Same palace as project mining. Different ingest strategy.
|
|||||||
import os
|
import os
|
||||||
import sys
|
import sys
|
||||||
import hashlib
|
import hashlib
|
||||||
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
from collections import defaultdict
|
from collections import defaultdict
|
||||||
@@ -24,6 +25,8 @@ from .palace import (
|
|||||||
mine_lock,
|
mine_lock,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
logger = logging.getLogger("mempalace_mcp")
|
||||||
|
|
||||||
|
|
||||||
# Cached hall keywords — avoids re-reading config per drawer
|
# Cached hall keywords — avoids re-reading config per drawer
|
||||||
_HALL_KEYWORDS_CACHE = None
|
_HALL_KEYWORDS_CACHE = None
|
||||||
@@ -331,7 +334,7 @@ def _file_chunks_locked(collection, source_file, chunks, wing, room, agent, extr
|
|||||||
try:
|
try:
|
||||||
collection.delete(where={"source_file": source_file})
|
collection.delete(where={"source_file": source_file})
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Stale-drawer purge failed for %s", source_file, exc_info=True)
|
||||||
|
|
||||||
# Batch chunks into bounded upserts so large transcripts keep most of
|
# Batch chunks into bounded upserts so large transcripts keep most of
|
||||||
# the embedding speedup without one huge Chroma/SQLite request. Keep
|
# the embedding speedup without one huge Chroma/SQLite request. Keep
|
||||||
|
|||||||
+1
-1
@@ -89,7 +89,7 @@ def dedup_source_group(col, drawer_ids, threshold=DEFAULT_THRESHOLD, dry_run=Tru
|
|||||||
kept = []
|
kept = []
|
||||||
to_delete = []
|
to_delete = []
|
||||||
|
|
||||||
for did, doc, meta in items:
|
for did, doc, _meta in items:
|
||||||
if not doc or len(doc) < 20:
|
if not doc or len(doc) < 20:
|
||||||
to_delete.append(did)
|
to_delete.append(did)
|
||||||
continue
|
continue
|
||||||
|
|||||||
@@ -873,7 +873,7 @@ class Dialect:
|
|||||||
|
|
||||||
for date_key in sorted(by_date.keys()):
|
for date_key in sorted(by_date.keys()):
|
||||||
lines.append(f"=MOMENTS[{date_key}]=")
|
lines.append(f"=MOMENTS[{date_key}]=")
|
||||||
for z, fnum in by_date[date_key]:
|
for z, _fnum in by_date[date_key]:
|
||||||
entities = []
|
entities = []
|
||||||
for p in z.get("people", []):
|
for p in z.get("people", []):
|
||||||
code = self.encode_entity(p)
|
code = self.encode_entity(p)
|
||||||
|
|||||||
@@ -16,6 +16,7 @@ Usage:
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
import json
|
import json
|
||||||
|
import os
|
||||||
import re
|
import re
|
||||||
import urllib.request
|
import urllib.request
|
||||||
import urllib.parse
|
import urllib.parse
|
||||||
@@ -320,11 +321,35 @@ class EntityRegistry:
|
|||||||
self._path.parent.chmod(0o700)
|
self._path.parent.chmod(0o700)
|
||||||
except (OSError, NotImplementedError):
|
except (OSError, NotImplementedError):
|
||||||
pass
|
pass
|
||||||
self._path.write_text(json.dumps(self._data, indent=2), encoding="utf-8")
|
# Atomic write: serialize to a sibling temp file in the same dir
|
||||||
|
# (so os.replace stays on one filesystem), fsync, then rename over
|
||||||
|
# the target. A crash mid-write leaves the previous registry intact
|
||||||
|
# instead of a half-written file or an empty file from the truncate.
|
||||||
|
payload = json.dumps(self._data, indent=2)
|
||||||
|
tmp_path = self._path.with_name(self._path.name + ".tmp")
|
||||||
|
with open(tmp_path, "w", encoding="utf-8") as f:
|
||||||
|
f.write(payload)
|
||||||
|
f.flush()
|
||||||
|
os.fsync(f.fileno())
|
||||||
try:
|
try:
|
||||||
self._path.chmod(0o600)
|
tmp_path.chmod(0o600)
|
||||||
except (OSError, NotImplementedError):
|
except (OSError, NotImplementedError):
|
||||||
pass
|
pass
|
||||||
|
os.replace(tmp_path, self._path)
|
||||||
|
# On ext4 (and similar) the rename's durability across power loss
|
||||||
|
# requires an additional fsync on the parent directory. Without it,
|
||||||
|
# the kernel can ack the rename and a crash reverts to the state
|
||||||
|
# where the temp file is present and the target is at the old version.
|
||||||
|
try:
|
||||||
|
dir_fd = os.open(str(self._path.parent), os.O_RDONLY)
|
||||||
|
try:
|
||||||
|
os.fsync(dir_fd)
|
||||||
|
finally:
|
||||||
|
os.close(dir_fd)
|
||||||
|
except OSError:
|
||||||
|
# Windows and some special filesystems reject directory fds — they
|
||||||
|
# have different durability semantics on rename anyway.
|
||||||
|
pass
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _empty() -> dict:
|
def _empty() -> dict:
|
||||||
|
|||||||
@@ -27,6 +27,7 @@ Usage:
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
from datetime import datetime, timezone
|
from datetime import datetime, timezone
|
||||||
@@ -35,6 +36,8 @@ from datetime import datetime, timezone
|
|||||||
# ~/.mempalace/known_entities.json on every check_text call.
|
# ~/.mempalace/known_entities.json on every check_text call.
|
||||||
from .miner import _load_known_entities_raw
|
from .miner import _load_known_entities_raw
|
||||||
|
|
||||||
|
logger = logging.getLogger("mempalace_mcp")
|
||||||
|
|
||||||
|
|
||||||
# Narrow detection patterns — parse "X is Y's Z" and "X's Z is Y".
|
# Narrow detection patterns — parse "X is Y's Z" and "X's Z is Y".
|
||||||
# Names are captured greedily as word sequences (letters + optional
|
# Names are captured greedily as word sequences (letters + optional
|
||||||
@@ -214,6 +217,7 @@ def _check_kg_contradictions(text: str, palace_path: str) -> list:
|
|||||||
try:
|
try:
|
||||||
facts = kg.query_entity(subject, direction="outgoing")
|
facts = kg.query_entity(subject, direction="outgoing")
|
||||||
except Exception:
|
except Exception:
|
||||||
|
logger.debug("KG lookup failed for subject %r", subject, exc_info=True)
|
||||||
continue
|
continue
|
||||||
if not facts:
|
if not facts:
|
||||||
continue
|
continue
|
||||||
@@ -303,11 +307,27 @@ def _edit_distance(s1: str, s2: str) -> int:
|
|||||||
return prev[-1]
|
return prev[-1]
|
||||||
|
|
||||||
|
|
||||||
|
def _reconfigure_stdio_utf8_on_windows():
|
||||||
|
"""Decode --stdin payload as UTF-8 on Windows.
|
||||||
|
|
||||||
|
Thin wrapper around the shared helper in ``mempalace._stdio``. Mirrors
|
||||||
|
the primary CLI policy: stdout/stderr use ``replace`` because
|
||||||
|
extracted fact text can include surrogate halves round-tripped from
|
||||||
|
filenames -- ``strict`` would raise UnicodeEncodeError mid-print.
|
||||||
|
stdin keeps the default ``surrogateescape``.
|
||||||
|
"""
|
||||||
|
from ._stdio import reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
reconfigure_stdio_utf8_on_windows(stdout_errors="replace", stderr_errors="replace")
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
import argparse
|
import argparse
|
||||||
import json
|
import json
|
||||||
import sys
|
import sys
|
||||||
|
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(
|
parser = argparse.ArgumentParser(
|
||||||
description="Check text against known facts in the MemPalace palace.",
|
description="Check text against known facts in the MemPalace palace.",
|
||||||
epilog="Exits 0 when no issues found, 1 when one or more issues detected.",
|
epilog="Exits 0 when no issues found, 1 when one or more issues detected.",
|
||||||
|
|||||||
@@ -16,6 +16,23 @@ from pathlib import Path
|
|||||||
|
|
||||||
SAVE_INTERVAL = 15
|
SAVE_INTERVAL = 15
|
||||||
STATE_DIR = Path.home() / ".mempalace" / "hook_state"
|
STATE_DIR = Path.home() / ".mempalace" / "hook_state"
|
||||||
|
PALACE_ROOT = Path.home() / ".mempalace"
|
||||||
|
|
||||||
|
|
||||||
|
def _palace_root_exists() -> bool:
|
||||||
|
"""User-removable kill-switch.
|
||||||
|
|
||||||
|
If ~/.mempalace/ does not exist, the user has explicitly cleared it.
|
||||||
|
All hook side effects (logging, state dir creation, mining, ingestion)
|
||||||
|
must respect this and short-circuit BEFORE touching disk — including
|
||||||
|
before logging the short-circuit itself.
|
||||||
|
|
||||||
|
Uses ``is_dir()`` rather than ``exists()`` so a stray regular file at
|
||||||
|
``~/.mempalace`` (or a broken symlink) is treated as absent — otherwise
|
||||||
|
the kill-switch would be bypassed and ``STATE_DIR.mkdir()`` would later
|
||||||
|
crash on ``NotADirectoryError``.
|
||||||
|
"""
|
||||||
|
return PALACE_ROOT.is_dir()
|
||||||
|
|
||||||
|
|
||||||
def _mempalace_python() -> str:
|
def _mempalace_python() -> str:
|
||||||
@@ -142,6 +159,8 @@ _state_dir_initialized = False
|
|||||||
|
|
||||||
def _log(message: str):
|
def _log(message: str):
|
||||||
"""Append to hook state log file."""
|
"""Append to hook state log file."""
|
||||||
|
if not _palace_root_exists():
|
||||||
|
return # User removed the palace; do not recreate by logging
|
||||||
global _state_dir_initialized
|
global _state_dir_initialized
|
||||||
try:
|
try:
|
||||||
if not _state_dir_initialized:
|
if not _state_dir_initialized:
|
||||||
@@ -550,6 +569,9 @@ def _wing_from_transcript_path(transcript_path: str) -> str:
|
|||||||
|
|
||||||
def hook_stop(data: dict, harness: str):
|
def hook_stop(data: dict, harness: str):
|
||||||
"""Stop hook: block every N messages for auto-save."""
|
"""Stop hook: block every N messages for auto-save."""
|
||||||
|
if not _palace_root_exists():
|
||||||
|
_output({})
|
||||||
|
return
|
||||||
parsed = _parse_harness_input(data, harness)
|
parsed = _parse_harness_input(data, harness)
|
||||||
session_id = parsed["session_id"]
|
session_id = parsed["session_id"]
|
||||||
stop_hook_active = parsed["stop_hook_active"]
|
stop_hook_active = parsed["stop_hook_active"]
|
||||||
@@ -659,6 +681,9 @@ def hook_stop(data: dict, harness: str):
|
|||||||
|
|
||||||
def hook_session_start(data: dict, harness: str):
|
def hook_session_start(data: dict, harness: str):
|
||||||
"""Session start hook: initialize session tracking state."""
|
"""Session start hook: initialize session tracking state."""
|
||||||
|
if not _palace_root_exists():
|
||||||
|
_output({})
|
||||||
|
return
|
||||||
parsed = _parse_harness_input(data, harness)
|
parsed = _parse_harness_input(data, harness)
|
||||||
session_id = parsed["session_id"]
|
session_id = parsed["session_id"]
|
||||||
|
|
||||||
@@ -673,6 +698,9 @@ def hook_session_start(data: dict, harness: str):
|
|||||||
|
|
||||||
def hook_precompact(data: dict, harness: str):
|
def hook_precompact(data: dict, harness: str):
|
||||||
"""Precompact hook: mine transcript synchronously, then allow compaction."""
|
"""Precompact hook: mine transcript synchronously, then allow compaction."""
|
||||||
|
if not _palace_root_exists():
|
||||||
|
_output({})
|
||||||
|
return
|
||||||
parsed = _parse_harness_input(data, harness)
|
parsed = _parse_harness_input(data, harness)
|
||||||
session_id = parsed["session_id"]
|
session_id = parsed["session_id"]
|
||||||
transcript_path = parsed["transcript_path"]
|
transcript_path = parsed["transcript_path"]
|
||||||
|
|||||||
@@ -171,6 +171,15 @@ class KnowledgeGraph:
|
|||||||
add_triple("Max", "does", "swimming", valid_from="2025-01-01")
|
add_triple("Max", "does", "swimming", valid_from="2025-01-01")
|
||||||
add_triple("Alice", "worried_about", "Max injury", valid_from="2026-01", valid_to="2026-02")
|
add_triple("Alice", "worried_about", "Max injury", valid_from="2026-01", valid_to="2026-02")
|
||||||
"""
|
"""
|
||||||
|
# Reject inverted intervals: a triple with valid_to < valid_from
|
||||||
|
# would never satisfy `valid_from <= as_of AND valid_to >= as_of`,
|
||||||
|
# so it would be invisible to every query — silently corrupt.
|
||||||
|
if valid_from is not None and valid_to is not None and valid_to < valid_from:
|
||||||
|
raise ValueError(
|
||||||
|
f"valid_to={valid_to!r} is before valid_from={valid_from!r}; "
|
||||||
|
"an inverted interval would be invisible to every KG query"
|
||||||
|
)
|
||||||
|
|
||||||
sub_id = self._entity_id(subject)
|
sub_id = self._entity_id(subject)
|
||||||
obj_id = self._entity_id(obj)
|
obj_id = self._entity_id(obj)
|
||||||
pred = predicate.lower().replace(" ", "_")
|
pred = predicate.lower().replace(" ", "_")
|
||||||
|
|||||||
+6
-2
@@ -124,6 +124,8 @@ class Layer1:
|
|||||||
# Score each drawer: prefer high importance, recent filing
|
# Score each drawer: prefer high importance, recent filing
|
||||||
scored = []
|
scored = []
|
||||||
for doc, meta in zip(docs, metas):
|
for doc, meta in zip(docs, metas):
|
||||||
|
meta = meta or {}
|
||||||
|
doc = doc or ""
|
||||||
importance = 3
|
importance = 3
|
||||||
# Try multiple metadata keys that might carry weight info
|
# Try multiple metadata keys that might carry weight info
|
||||||
for key in ("importance", "emotional_weight", "weight"):
|
for key in ("importance", "emotional_weight", "weight"):
|
||||||
@@ -155,7 +157,7 @@ class Layer1:
|
|||||||
lines.append(room_line)
|
lines.append(room_line)
|
||||||
total_len += len(room_line)
|
total_len += len(room_line)
|
||||||
|
|
||||||
for imp, meta, doc in entries:
|
for _imp, meta, doc in entries:
|
||||||
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
||||||
|
|
||||||
# Truncate doc to keep L1 compact
|
# Truncate doc to keep L1 compact
|
||||||
@@ -222,6 +224,8 @@ class Layer2:
|
|||||||
|
|
||||||
lines = [f"## L2 — ON-DEMAND ({len(docs)} drawers)"]
|
lines = [f"## L2 — ON-DEMAND ({len(docs)} drawers)"]
|
||||||
for doc, meta in zip(docs[:n_results], metas[:n_results]):
|
for doc, meta in zip(docs[:n_results], metas[:n_results]):
|
||||||
|
meta = meta or {}
|
||||||
|
doc = doc or ""
|
||||||
room_name = meta.get("room", "?")
|
room_name = meta.get("room", "?")
|
||||||
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
||||||
snippet = doc.strip().replace("\n", " ")
|
snippet = doc.strip().replace("\n", " ")
|
||||||
@@ -283,7 +287,7 @@ class Layer3:
|
|||||||
for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1):
|
for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1):
|
||||||
meta = meta or {}
|
meta = meta or {}
|
||||||
doc = doc or ""
|
doc = doc or ""
|
||||||
similarity = round(1 - dist, 3)
|
similarity = round(max(0.0, 1 - dist), 3)
|
||||||
wing_name = meta.get("wing", "?")
|
wing_name = meta.get("wing", "?")
|
||||||
room_name = meta.get("room", "?")
|
room_name = meta.get("room", "?")
|
||||||
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
source = Path(meta.get("source_file", "")).name if meta.get("source_file") else ""
|
||||||
|
|||||||
+211
-34
@@ -46,6 +46,8 @@ import argparse # noqa: E402 (deferred until after stdio protection above)
|
|||||||
import json # noqa: E402
|
import json # noqa: E402
|
||||||
import logging # noqa: E402
|
import logging # noqa: E402
|
||||||
import hashlib # noqa: E402
|
import hashlib # noqa: E402
|
||||||
|
import sqlite3 # noqa: E402
|
||||||
|
import threading # noqa: E402
|
||||||
import time # noqa: E402
|
import time # noqa: E402
|
||||||
from datetime import date, datetime # noqa: E402
|
from datetime import date, datetime # noqa: E402
|
||||||
from pathlib import Path # noqa: E402
|
from pathlib import Path # noqa: E402
|
||||||
@@ -55,6 +57,7 @@ from .config import ( # noqa: E402
|
|||||||
sanitize_kg_value,
|
sanitize_kg_value,
|
||||||
sanitize_name,
|
sanitize_name,
|
||||||
sanitize_content,
|
sanitize_content,
|
||||||
|
sanitize_iso_date,
|
||||||
)
|
)
|
||||||
from .version import __version__ # noqa: E402
|
from .version import __version__ # noqa: E402
|
||||||
from chromadb.errors import NotFoundError as _ChromaNotFoundError # noqa: E402
|
from chromadb.errors import NotFoundError as _ChromaNotFoundError # noqa: E402
|
||||||
@@ -78,7 +81,7 @@ from .palace_graph import ( # noqa: E402
|
|||||||
follow_tunnels,
|
follow_tunnels,
|
||||||
)
|
)
|
||||||
|
|
||||||
from .knowledge_graph import KnowledgeGraph # noqa: E402
|
from .knowledge_graph import KnowledgeGraph, DEFAULT_KG_PATH # noqa: E402
|
||||||
|
|
||||||
logging.basicConfig(level=logging.INFO, format="%(message)s", stream=sys.stderr)
|
logging.basicConfig(level=logging.INFO, format="%(message)s", stream=sys.stderr)
|
||||||
logger = logging.getLogger("mempalace_mcp")
|
logger = logging.getLogger("mempalace_mcp")
|
||||||
@@ -103,12 +106,61 @@ if _args.palace:
|
|||||||
os.environ["MEMPALACE_PALACE_PATH"] = os.path.abspath(_args.palace)
|
os.environ["MEMPALACE_PALACE_PATH"] = os.path.abspath(_args.palace)
|
||||||
|
|
||||||
_config = MempalaceConfig()
|
_config = MempalaceConfig()
|
||||||
# Only override KG path when --palace is explicitly provided; otherwise use
|
|
||||||
# KnowledgeGraph's default (~/.mempalace/knowledge_graph.sqlite3).
|
_kg_by_path: dict[str, KnowledgeGraph] = {}
|
||||||
if _args.palace:
|
_kg_cache_lock = threading.Lock()
|
||||||
_kg = KnowledgeGraph(db_path=os.path.join(_config.palace_path, "knowledge_graph.sqlite3"))
|
_palace_flag_given: bool = bool(_args.palace)
|
||||||
else:
|
|
||||||
_kg = KnowledgeGraph()
|
|
||||||
|
def _resolve_kg_path() -> str:
|
||||||
|
if _palace_flag_given:
|
||||||
|
return os.path.join(_config.palace_path, "knowledge_graph.sqlite3")
|
||||||
|
return DEFAULT_KG_PATH
|
||||||
|
|
||||||
|
|
||||||
|
def _get_kg() -> KnowledgeGraph:
|
||||||
|
path = os.path.abspath(_resolve_kg_path())
|
||||||
|
kg = _kg_by_path.get(path)
|
||||||
|
if kg is not None:
|
||||||
|
return kg
|
||||||
|
with _kg_cache_lock:
|
||||||
|
kg = _kg_by_path.get(path)
|
||||||
|
if kg is None:
|
||||||
|
kg = KnowledgeGraph(db_path=path)
|
||||||
|
_kg_by_path[path] = kg
|
||||||
|
return kg
|
||||||
|
|
||||||
|
|
||||||
|
def _call_kg(op):
|
||||||
|
"""Run ``op(kg)`` against the cached KG with one-shot retry on close.
|
||||||
|
|
||||||
|
Race we're guarding against: a handler grabs ``kg = _get_kg()`` and is
|
||||||
|
about to call ``kg.add_triple(...)`` when ``tool_reconnect`` fires on
|
||||||
|
another thread, drains ``_kg_by_path``, and closes the underlying
|
||||||
|
sqlite3.Connection. The handler's call then raises
|
||||||
|
``sqlite3.ProgrammingError: Cannot operate on a closed database`` and
|
||||||
|
bubbles up as a -32000 to the MCP client even though the user just
|
||||||
|
asked for a reconnect.
|
||||||
|
|
||||||
|
Catch that single class of error, evict the stale entry from the
|
||||||
|
cache (only if it still points at the closed instance — another
|
||||||
|
thread may have already replaced it), and try once more with a fresh
|
||||||
|
KG. Beyond one retry give up: a second close means we're losing a
|
||||||
|
sustained race we won't win in this loop, and a hung loop is worse
|
||||||
|
than a clear failure surface.
|
||||||
|
"""
|
||||||
|
for attempt in range(2):
|
||||||
|
kg = _get_kg()
|
||||||
|
try:
|
||||||
|
return op(kg)
|
||||||
|
except sqlite3.ProgrammingError:
|
||||||
|
if attempt == 0:
|
||||||
|
path = os.path.abspath(_resolve_kg_path())
|
||||||
|
with _kg_cache_lock:
|
||||||
|
if _kg_by_path.get(path) is kg:
|
||||||
|
_kg_by_path.pop(path, None)
|
||||||
|
continue
|
||||||
|
raise
|
||||||
|
|
||||||
|
|
||||||
_client_cache = None
|
_client_cache = None
|
||||||
@@ -141,7 +193,7 @@ def _refresh_vector_disabled_flag() -> None:
|
|||||||
"""
|
"""
|
||||||
global _vector_disabled, _vector_disabled_reason, _vector_capacity_status
|
global _vector_disabled, _vector_disabled_reason, _vector_capacity_status
|
||||||
try:
|
try:
|
||||||
info = hnsw_capacity_status(_config.palace_path, "mempalace_drawers")
|
info = hnsw_capacity_status(_config.palace_path, _config.collection_name)
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.debug("HNSW capacity probe raised", exc_info=True)
|
logger.debug("HNSW capacity probe raised", exc_info=True)
|
||||||
return
|
return
|
||||||
@@ -274,8 +326,19 @@ def _get_client():
|
|||||||
|
|
||||||
|
|
||||||
def _get_collection(create=False):
|
def _get_collection(create=False):
|
||||||
"""Return the ChromaDB collection, caching the client between calls."""
|
"""Return the ChromaDB collection, caching the client between calls.
|
||||||
global _collection_cache, _metadata_cache, _metadata_cache_time
|
|
||||||
|
On failure, log the exception and retry once after clearing the client
|
||||||
|
and collection caches. Tools were silently returning ``None`` when a
|
||||||
|
cached client/collection went stale — typically after the chromadb
|
||||||
|
rust bindings invalidated a handle following an out-of-band write —
|
||||||
|
leaving the LLM with no diagnostic and no recovery path. The retry
|
||||||
|
forces ``_get_client()`` to rebuild from scratch (which re-runs
|
||||||
|
``quarantine_stale_hnsw`` per #1322), so the second attempt heals the
|
||||||
|
common stale-handle / stale-HNSW case automatically.
|
||||||
|
"""
|
||||||
|
global _client_cache, _collection_cache, _metadata_cache, _metadata_cache_time
|
||||||
|
for attempt in range(2):
|
||||||
try:
|
try:
|
||||||
client = _get_client()
|
client = _get_client()
|
||||||
# ChromaDB 1.x persists the EF *identity* (its ``name()``) with the
|
# ChromaDB 1.x persists the EF *identity* (its ``name()``) with the
|
||||||
@@ -322,7 +385,7 @@ def _get_collection(create=False):
|
|||||||
**ef_kwargs,
|
**ef_kwargs,
|
||||||
)
|
)
|
||||||
_pin_hnsw_threads(raw)
|
_pin_hnsw_threads(raw)
|
||||||
_collection_cache = ChromaCollection(raw)
|
_collection_cache = ChromaCollection(raw, palace_path=_config.palace_path)
|
||||||
_metadata_cache = None
|
_metadata_cache = None
|
||||||
_metadata_cache_time = 0
|
_metadata_cache_time = 0
|
||||||
elif _collection_cache is None:
|
elif _collection_cache is None:
|
||||||
@@ -330,11 +393,26 @@ def _get_collection(create=False):
|
|||||||
ef_kwargs = {"embedding_function": ef} if ef is not None else {}
|
ef_kwargs = {"embedding_function": ef} if ef is not None else {}
|
||||||
raw = client.get_collection(_config.collection_name, **ef_kwargs)
|
raw = client.get_collection(_config.collection_name, **ef_kwargs)
|
||||||
_pin_hnsw_threads(raw)
|
_pin_hnsw_threads(raw)
|
||||||
_collection_cache = ChromaCollection(raw)
|
_collection_cache = ChromaCollection(raw, palace_path=_config.palace_path)
|
||||||
_metadata_cache = None
|
_metadata_cache = None
|
||||||
_metadata_cache_time = 0
|
_metadata_cache_time = 0
|
||||||
return _collection_cache
|
return _collection_cache
|
||||||
except Exception:
|
except Exception:
|
||||||
|
logger.exception(
|
||||||
|
"_get_collection attempt %d/2 failed (palace=%s, create=%s)",
|
||||||
|
attempt + 1,
|
||||||
|
_config.palace_path,
|
||||||
|
create,
|
||||||
|
)
|
||||||
|
if attempt == 0:
|
||||||
|
# Reset all caches so the next attempt forces _get_client()
|
||||||
|
# to rebuild the chromadb client from scratch — that path
|
||||||
|
# re-runs quarantine_stale_hnsw (#1322) and reopens the
|
||||||
|
# collection cleanly, healing the common stale-handle case.
|
||||||
|
_client_cache = None
|
||||||
|
_collection_cache = None
|
||||||
|
_metadata_cache = None
|
||||||
|
_metadata_cache_time = 0
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
@@ -412,6 +490,7 @@ def _tool_status_via_sqlite() -> dict:
|
|||||||
db_path = os.path.join(_config.palace_path, "chroma.sqlite3")
|
db_path = os.path.join(_config.palace_path, "chroma.sqlite3")
|
||||||
if not os.path.isfile(db_path):
|
if not os.path.isfile(db_path):
|
||||||
return _no_palace()
|
return _no_palace()
|
||||||
|
collection_name = _config.collection_name
|
||||||
|
|
||||||
wings: dict = {}
|
wings: dict = {}
|
||||||
rooms: dict = {}
|
rooms: dict = {}
|
||||||
@@ -425,8 +504,9 @@ def _tool_status_via_sqlite() -> dict:
|
|||||||
FROM embeddings e
|
FROM embeddings e
|
||||||
JOIN segments s ON e.segment_id = s.id
|
JOIN segments s ON e.segment_id = s.id
|
||||||
JOIN collections c ON s.collection = c.id
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE c.name = 'mempalace_drawers'
|
WHERE c.name = ?
|
||||||
"""
|
""",
|
||||||
|
(collection_name,),
|
||||||
).fetchone()
|
).fetchone()
|
||||||
total = int(row[0]) if row and row[0] is not None else 0
|
total = int(row[0]) if row and row[0] is not None else 0
|
||||||
for key, target in (("wing", wings), ("room", rooms)):
|
for key, target in (("wing", wings), ("room", rooms)):
|
||||||
@@ -437,12 +517,12 @@ def _tool_status_via_sqlite() -> dict:
|
|||||||
JOIN embeddings e ON em.id = e.id
|
JOIN embeddings e ON em.id = e.id
|
||||||
JOIN segments s ON e.segment_id = s.id
|
JOIN segments s ON e.segment_id = s.id
|
||||||
JOIN collections c ON s.collection = c.id
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE c.name = 'mempalace_drawers'
|
WHERE c.name = ?
|
||||||
AND em.key = ?
|
AND em.key = ?
|
||||||
AND em.string_value IS NOT NULL
|
AND em.string_value IS NOT NULL
|
||||||
GROUP BY em.string_value
|
GROUP BY em.string_value
|
||||||
""",
|
""",
|
||||||
(key,),
|
(collection_name, key),
|
||||||
):
|
):
|
||||||
target[value] = count
|
target[value] = count
|
||||||
finally:
|
finally:
|
||||||
@@ -642,6 +722,7 @@ def tool_search(
|
|||||||
n_results=limit,
|
n_results=limit,
|
||||||
max_distance=dist,
|
max_distance=dist,
|
||||||
vector_disabled=_vector_disabled,
|
vector_disabled=_vector_disabled,
|
||||||
|
collection_name=_config.collection_name,
|
||||||
)
|
)
|
||||||
if _vector_disabled:
|
if _vector_disabled:
|
||||||
result["vector_disabled"] = True
|
result["vector_disabled"] = True
|
||||||
@@ -688,10 +769,12 @@ def tool_check_duplicate(content: str, threshold: float = 0.9):
|
|||||||
if results["ids"] and results["ids"][0]:
|
if results["ids"] and results["ids"][0]:
|
||||||
for i, drawer_id in enumerate(results["ids"][0]):
|
for i, drawer_id in enumerate(results["ids"][0]):
|
||||||
dist = results["distances"][0][i]
|
dist = results["distances"][0][i]
|
||||||
similarity = round(1 - dist, 3)
|
similarity = round(max(0.0, 1 - dist), 3)
|
||||||
if similarity >= threshold:
|
if similarity >= threshold:
|
||||||
meta = results["metadatas"][0][i]
|
# Chroma 1.5.x can return None for partially-flushed rows;
|
||||||
doc = results["documents"][0][i]
|
# coerce to empty sentinels so downstream .get() is safe.
|
||||||
|
meta = results["metadatas"][0][i] or {}
|
||||||
|
doc = results["documents"][0][i] or ""
|
||||||
duplicates.append(
|
duplicates.append(
|
||||||
{
|
{
|
||||||
"id": drawer_id,
|
"id": drawer_id,
|
||||||
@@ -842,11 +925,11 @@ def tool_add_drawer(
|
|||||||
|
|
||||||
# Idempotency: if the deterministic ID already exists, return success as a no-op.
|
# Idempotency: if the deterministic ID already exists, return success as a no-op.
|
||||||
try:
|
try:
|
||||||
existing = col.get(ids=[drawer_id])
|
existing = col.get(ids=[drawer_id], include=[])
|
||||||
if existing and existing["ids"]:
|
if existing.ids:
|
||||||
return {"success": True, "reason": "already_exists", "drawer_id": drawer_id}
|
return {"success": True, "reason": "already_exists", "drawer_id": drawer_id}
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Idempotency pre-check failed for %s", drawer_id, exc_info=True)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
col.upsert(
|
col.upsert(
|
||||||
@@ -863,6 +946,12 @@ def tool_add_drawer(
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
|
inserted = col.get(ids=[drawer_id], include=[])
|
||||||
|
if not inserted.ids:
|
||||||
|
raise RuntimeError(
|
||||||
|
"Drawer write was acknowledged but the new ID is not readable. "
|
||||||
|
"The palace index may be stale; run reconnect or repair."
|
||||||
|
)
|
||||||
_metadata_cache = None
|
_metadata_cache = None
|
||||||
logger.info(f"Filed drawer: {drawer_id} → {wing}/{room}")
|
logger.info(f"Filed drawer: {drawer_id} → {wing}/{room}")
|
||||||
return {"success": True, "drawer_id": drawer_id, "wing": wing, "room": room}
|
return {"success": True, "drawer_id": drawer_id, "wing": wing, "room": room}
|
||||||
@@ -961,6 +1050,13 @@ def tool_list_drawers(wing: str = None, room: str = None, limit: int = 20, offse
|
|||||||
kwargs["where"] = where
|
kwargs["where"] = where
|
||||||
result = col.get(**kwargs)
|
result = col.get(**kwargs)
|
||||||
|
|
||||||
|
# Compute total matching drawers for pagination.
|
||||||
|
if where:
|
||||||
|
total_result = col.get(where=where, include=[])
|
||||||
|
total = len(total_result["ids"])
|
||||||
|
else:
|
||||||
|
total = col.count()
|
||||||
|
|
||||||
drawers = []
|
drawers = []
|
||||||
for i, did in enumerate(result["ids"]):
|
for i, did in enumerate(result["ids"]):
|
||||||
meta = result["metadatas"][i]
|
meta = result["metadatas"][i]
|
||||||
@@ -975,6 +1071,7 @@ def tool_list_drawers(wing: str = None, room: str = None, limit: int = 20, offse
|
|||||||
)
|
)
|
||||||
return {
|
return {
|
||||||
"drawers": drawers,
|
"drawers": drawers,
|
||||||
|
"total": total,
|
||||||
"count": len(drawers),
|
"count": len(drawers),
|
||||||
"offset": offset,
|
"offset": offset,
|
||||||
"limit": limit,
|
"limit": limit,
|
||||||
@@ -1059,11 +1156,12 @@ def tool_kg_query(entity: str, as_of: str = None, direction: str = "both"):
|
|||||||
"""Query the knowledge graph for an entity's relationships."""
|
"""Query the knowledge graph for an entity's relationships."""
|
||||||
try:
|
try:
|
||||||
entity = sanitize_kg_value(entity, "entity")
|
entity = sanitize_kg_value(entity, "entity")
|
||||||
|
as_of = sanitize_iso_date(as_of, "as_of")
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
return {"error": str(e)}
|
return {"error": str(e)}
|
||||||
if direction not in ("outgoing", "incoming", "both"):
|
if direction not in ("outgoing", "incoming", "both"):
|
||||||
return {"error": "direction must be 'outgoing', 'incoming', or 'both'"}
|
return {"error": "direction must be 'outgoing', 'incoming', or 'both'"}
|
||||||
results = _kg.query_entity(entity, as_of=as_of, direction=direction)
|
results = _call_kg(lambda kg: kg.query_entity(entity, as_of=as_of, direction=direction))
|
||||||
return {"entity": entity, "as_of": as_of, "facts": results, "count": len(results)}
|
return {"entity": entity, "as_of": as_of, "facts": results, "count": len(results)}
|
||||||
|
|
||||||
|
|
||||||
@@ -1092,6 +1190,7 @@ def tool_kg_add(
|
|||||||
subject = sanitize_kg_value(subject, "subject")
|
subject = sanitize_kg_value(subject, "subject")
|
||||||
predicate = sanitize_name(predicate, "predicate")
|
predicate = sanitize_name(predicate, "predicate")
|
||||||
object = sanitize_kg_value(object, "object")
|
object = sanitize_kg_value(object, "object")
|
||||||
|
valid_from = sanitize_iso_date(valid_from, "valid_from")
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
return {"success": False, "error": str(e)}
|
return {"success": False, "error": str(e)}
|
||||||
|
|
||||||
@@ -1108,7 +1207,8 @@ def tool_kg_add(
|
|||||||
"source_drawer_id": source_drawer_id,
|
"source_drawer_id": source_drawer_id,
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
triple_id = _kg.add_triple(
|
triple_id = _call_kg(
|
||||||
|
lambda kg: kg.add_triple(
|
||||||
subject,
|
subject,
|
||||||
predicate,
|
predicate,
|
||||||
object,
|
object,
|
||||||
@@ -1118,6 +1218,7 @@ def tool_kg_add(
|
|||||||
source_file=source_file,
|
source_file=source_file,
|
||||||
source_drawer_id=source_drawer_id,
|
source_drawer_id=source_drawer_id,
|
||||||
)
|
)
|
||||||
|
)
|
||||||
return {"success": True, "triple_id": triple_id, "fact": f"{subject} → {predicate} → {object}"}
|
return {"success": True, "triple_id": triple_id, "fact": f"{subject} → {predicate} → {object}"}
|
||||||
|
|
||||||
|
|
||||||
@@ -1135,6 +1236,7 @@ def tool_kg_invalidate(subject: str, predicate: str, object: str, ended: str = N
|
|||||||
subject = sanitize_kg_value(subject, "subject")
|
subject = sanitize_kg_value(subject, "subject")
|
||||||
predicate = sanitize_name(predicate, "predicate")
|
predicate = sanitize_name(predicate, "predicate")
|
||||||
object = sanitize_kg_value(object, "object")
|
object = sanitize_kg_value(object, "object")
|
||||||
|
ended = sanitize_iso_date(ended, "ended")
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
return {"success": False, "error": str(e)}
|
return {"success": False, "error": str(e)}
|
||||||
resolved_ended = ended or date.today().isoformat()
|
resolved_ended = ended or date.today().isoformat()
|
||||||
@@ -1147,7 +1249,7 @@ def tool_kg_invalidate(subject: str, predicate: str, object: str, ended: str = N
|
|||||||
"ended": resolved_ended,
|
"ended": resolved_ended,
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
_kg.invalidate(subject, predicate, object, ended=resolved_ended)
|
_call_kg(lambda kg: kg.invalidate(subject, predicate, object, ended=resolved_ended))
|
||||||
return {
|
return {
|
||||||
"success": True,
|
"success": True,
|
||||||
"fact": f"{subject} → {predicate} → {object}",
|
"fact": f"{subject} → {predicate} → {object}",
|
||||||
@@ -1162,13 +1264,13 @@ def tool_kg_timeline(entity: str = None):
|
|||||||
entity = sanitize_kg_value(entity, "entity")
|
entity = sanitize_kg_value(entity, "entity")
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
return {"error": str(e)}
|
return {"error": str(e)}
|
||||||
results = _kg.timeline(entity)
|
results = _call_kg(lambda kg: kg.timeline(entity))
|
||||||
return {"entity": entity or "all", "timeline": results, "count": len(results)}
|
return {"entity": entity or "all", "timeline": results, "count": len(results)}
|
||||||
|
|
||||||
|
|
||||||
def tool_kg_stats():
|
def tool_kg_stats():
|
||||||
"""Knowledge graph overview: entities, triples, relationship types."""
|
"""Knowledge graph overview: entities, triples, relationship types."""
|
||||||
return _kg.stats()
|
return _call_kg(lambda kg: kg.stats())
|
||||||
|
|
||||||
|
|
||||||
# ==================== AGENT DIARY ====================
|
# ==================== AGENT DIARY ====================
|
||||||
@@ -1351,7 +1453,7 @@ def tool_hook_settings(silent_save: bool = None, desktop_toast: bool = None):
|
|||||||
try:
|
try:
|
||||||
config = MempalaceConfig()
|
config = MempalaceConfig()
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Could not re-read config after update", exc_info=True)
|
||||||
|
|
||||||
result = {
|
result = {
|
||||||
"success": True,
|
"success": True,
|
||||||
@@ -1400,10 +1502,11 @@ def tool_memories_filed_away():
|
|||||||
|
|
||||||
|
|
||||||
def tool_reconnect():
|
def tool_reconnect():
|
||||||
"""Force the MCP server to drop the cached ChromaDB collection and reconnect.
|
"""Force the MCP server to drop cached ChromaDB + KnowledgeGraph state.
|
||||||
|
|
||||||
Use after external scripts or CLI commands modify the palace database
|
Use after external scripts or CLI commands modify the palace database
|
||||||
directly, which can leave the in-memory HNSW index stale.
|
or replace ``knowledge_graph.sqlite3`` directly, which can leave the
|
||||||
|
in-memory HNSW index stale or pin a closed-on-disk SQLite connection.
|
||||||
"""
|
"""
|
||||||
global \
|
global \
|
||||||
_client_cache, \
|
_client_cache, \
|
||||||
@@ -1412,6 +1515,30 @@ def tool_reconnect():
|
|||||||
_palace_db_mtime, \
|
_palace_db_mtime, \
|
||||||
_vector_disabled, \
|
_vector_disabled, \
|
||||||
_vector_disabled_reason
|
_vector_disabled_reason
|
||||||
|
from . import palace as palace_module
|
||||||
|
|
||||||
|
close_errors = []
|
||||||
|
try:
|
||||||
|
palace_module._DEFAULT_BACKEND.close_palace(_config.palace_path)
|
||||||
|
except Exception as exc:
|
||||||
|
logger.debug("Failed to close shared palace backend during reconnect", exc_info=True)
|
||||||
|
close_errors.append(f"backend close_palace failed: {exc}")
|
||||||
|
try:
|
||||||
|
from chromadb.api.client import SharedSystemClient
|
||||||
|
|
||||||
|
clear_system_cache = getattr(SharedSystemClient, "clear_system_cache", None)
|
||||||
|
if callable(clear_system_cache):
|
||||||
|
clear_system_cache()
|
||||||
|
else:
|
||||||
|
logger.debug(
|
||||||
|
"SharedSystemClient.clear_system_cache is unavailable; skipping shared Chroma cache clear during reconnect"
|
||||||
|
)
|
||||||
|
except Exception as exc:
|
||||||
|
logger.debug(
|
||||||
|
"Failed to clear Chroma shared system cache during reconnect",
|
||||||
|
exc_info=True,
|
||||||
|
)
|
||||||
|
close_errors.append(f"shared Chroma cache clear failed: {exc}")
|
||||||
_client_cache = None
|
_client_cache = None
|
||||||
_collection_cache = None
|
_collection_cache = None
|
||||||
_palace_db_inode = 0
|
_palace_db_inode = 0
|
||||||
@@ -1421,15 +1548,36 @@ def tool_reconnect():
|
|||||||
# still applies after the reconnect.
|
# still applies after the reconnect.
|
||||||
_vector_disabled = False
|
_vector_disabled = False
|
||||||
_vector_disabled_reason = ""
|
_vector_disabled_reason = ""
|
||||||
|
# Drain the per-path KnowledgeGraph cache so a replaced sqlite file is
|
||||||
|
# reopened on the next tool call rather than served from a stale handle.
|
||||||
|
with _kg_cache_lock:
|
||||||
|
for kg in _kg_by_path.values():
|
||||||
|
try:
|
||||||
|
kg.close()
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
_kg_by_path.clear()
|
||||||
try:
|
try:
|
||||||
col = _get_collection()
|
col = _get_collection()
|
||||||
if col is None:
|
if col is None:
|
||||||
return {
|
result = {
|
||||||
"success": False,
|
"success": False,
|
||||||
"message": "No palace found after reconnect",
|
"message": "No palace found after reconnect",
|
||||||
"drawers": 0,
|
"drawers": 0,
|
||||||
"vector_disabled": _vector_disabled,
|
"vector_disabled": _vector_disabled,
|
||||||
}
|
}
|
||||||
|
if close_errors:
|
||||||
|
result["error"] = "; ".join(close_errors)
|
||||||
|
return result
|
||||||
|
if close_errors:
|
||||||
|
return {
|
||||||
|
"success": False,
|
||||||
|
"message": "Reconnect reopened the palace but failed to fully reset cached handles",
|
||||||
|
"drawers": col.count(),
|
||||||
|
"vector_disabled": _vector_disabled,
|
||||||
|
"vector_disabled_reason": _vector_disabled_reason,
|
||||||
|
"error": "; ".join(close_errors),
|
||||||
|
}
|
||||||
return {
|
return {
|
||||||
"success": True,
|
"success": True,
|
||||||
"message": "Reconnected to palace",
|
"message": "Reconnected to palace",
|
||||||
@@ -1750,7 +1898,7 @@ TOOLS = {
|
|||||||
"handler": tool_get_drawer,
|
"handler": tool_get_drawer,
|
||||||
},
|
},
|
||||||
"mempalace_list_drawers": {
|
"mempalace_list_drawers": {
|
||||||
"description": "List drawers with pagination. Optional wing/room filter. Returns IDs, wings, rooms, and content previews.",
|
"description": "List drawers with pagination. Optional wing/room filter. Returns IDs, wings, rooms, content previews, and total matching count for pagination.",
|
||||||
"input_schema": {
|
"input_schema": {
|
||||||
"type": "object",
|
"type": "object",
|
||||||
"properties": {
|
"properties": {
|
||||||
@@ -1891,6 +2039,12 @@ SUPPORTED_PROTOCOL_VERSIONS = [
|
|||||||
|
|
||||||
|
|
||||||
def handle_request(request):
|
def handle_request(request):
|
||||||
|
if not isinstance(request, dict):
|
||||||
|
return {
|
||||||
|
"jsonrpc": "2.0",
|
||||||
|
"id": None,
|
||||||
|
"error": {"code": -32600, "message": "Invalid Request"},
|
||||||
|
}
|
||||||
method = request.get("method") or ""
|
method = request.get("method") or ""
|
||||||
params = request.get("params") or {}
|
params = request.get("params") or {}
|
||||||
req_id = request.get("id")
|
req_id = request.get("id")
|
||||||
@@ -1928,6 +2082,15 @@ def handle_request(request):
|
|||||||
},
|
},
|
||||||
}
|
}
|
||||||
elif method == "tools/call":
|
elif method == "tools/call":
|
||||||
|
if not isinstance(params, dict) or "name" not in params:
|
||||||
|
return {
|
||||||
|
"jsonrpc": "2.0",
|
||||||
|
"id": req_id,
|
||||||
|
"error": {
|
||||||
|
"code": -32602,
|
||||||
|
"message": "Invalid params: 'name' is required for tools/call",
|
||||||
|
},
|
||||||
|
}
|
||||||
tool_name = params.get("name")
|
tool_name = params.get("name")
|
||||||
tool_args = params.get("arguments") or {}
|
tool_args = params.get("arguments") or {}
|
||||||
if tool_name not in TOOLS:
|
if tool_name not in TOOLS:
|
||||||
@@ -1976,7 +2139,11 @@ def handle_request(request):
|
|||||||
return {
|
return {
|
||||||
"jsonrpc": "2.0",
|
"jsonrpc": "2.0",
|
||||||
"id": req_id,
|
"id": req_id,
|
||||||
"result": {"content": [{"type": "text", "text": json.dumps(result, indent=2)}]},
|
"result": {
|
||||||
|
"content": [
|
||||||
|
{"type": "text", "text": json.dumps(result, indent=2, ensure_ascii=False)}
|
||||||
|
]
|
||||||
|
},
|
||||||
}
|
}
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception(f"Tool error in {tool_name}")
|
logger.exception(f"Tool error in {tool_name}")
|
||||||
@@ -2011,6 +2178,16 @@ def _restore_stdout():
|
|||||||
|
|
||||||
def main():
|
def main():
|
||||||
_restore_stdout()
|
_restore_stdout()
|
||||||
|
# Force UTF-8 on stdio. MCP JSON-RPC is UTF-8, but Python on Windows
|
||||||
|
# defaults stdin/stdout to the system codepage (e.g. cp1251), which
|
||||||
|
# corrupts non-ASCII payloads and surfaces as generic -32000 errors on
|
||||||
|
# Cyrillic/CJK content. See PEP 540.
|
||||||
|
for stream in (sys.stdin, sys.stdout):
|
||||||
|
if hasattr(stream, "reconfigure"):
|
||||||
|
try:
|
||||||
|
stream.reconfigure(encoding="utf-8", errors="replace")
|
||||||
|
except (AttributeError, OSError):
|
||||||
|
pass
|
||||||
logger.info("MemPalace MCP Server starting...")
|
logger.info("MemPalace MCP Server starting...")
|
||||||
# Pre-flight: probe HNSW capacity before any tool call so the warning
|
# Pre-flight: probe HNSW capacity before any tool call so the warning
|
||||||
# is visible at startup rather than on first use (#1222). Pure
|
# is visible at startup rather than on first use (#1222). Pure
|
||||||
@@ -2027,7 +2204,7 @@ def main():
|
|||||||
request = json.loads(line)
|
request = json.loads(line)
|
||||||
response = handle_request(request)
|
response = handle_request(request)
|
||||||
if response is not None:
|
if response is not None:
|
||||||
sys.stdout.write(json.dumps(response) + "\n")
|
sys.stdout.write(json.dumps(response, ensure_ascii=False) + "\n")
|
||||||
sys.stdout.flush()
|
sys.stdout.flush()
|
||||||
except KeyboardInterrupt:
|
except KeyboardInterrupt:
|
||||||
break
|
break
|
||||||
|
|||||||
+63
-2
@@ -22,6 +22,7 @@ import errno
|
|||||||
import os
|
import os
|
||||||
import shutil
|
import shutil
|
||||||
import sqlite3
|
import sqlite3
|
||||||
|
import uuid
|
||||||
from collections import defaultdict
|
from collections import defaultdict
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
|
|
||||||
@@ -155,6 +156,55 @@ def confirm_destructive_action(
|
|||||||
return True
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
def _result_ids(result) -> list:
|
||||||
|
"""Return ids from either the backend typed result or raw Chroma dict."""
|
||||||
|
|
||||||
|
if isinstance(result, dict):
|
||||||
|
return list(result.get("ids") or [])
|
||||||
|
|
||||||
|
return list(getattr(result, "ids", []) or [])
|
||||||
|
|
||||||
|
|
||||||
|
def collection_write_roundtrip_works(col) -> bool:
|
||||||
|
"""Return True only if the collection can upsert, read, and delete.
|
||||||
|
|
||||||
|
Some ChromaDB 0.6.x -> 1.5.x migrated collections remain readable while
|
||||||
|
writes and deletes silently no-op. A plain ``count()`` probe misses that
|
||||||
|
failure mode, so migrate must verify an actual write round-trip before
|
||||||
|
deciding that no rebuild is needed.
|
||||||
|
"""
|
||||||
|
|
||||||
|
probe_id = f"_mempalace_migrate_probe_{uuid.uuid4().hex}"
|
||||||
|
probe_doc = "mempalace migrate write round-trip probe"
|
||||||
|
probe_meta = {
|
||||||
|
"wing": "_mempalace_probe",
|
||||||
|
"room": "_mempalace_probe",
|
||||||
|
"source_file": "mempalace_migrate_probe",
|
||||||
|
"chunk_index": 0,
|
||||||
|
}
|
||||||
|
|
||||||
|
try:
|
||||||
|
col.upsert(
|
||||||
|
ids=[probe_id],
|
||||||
|
documents=[probe_doc],
|
||||||
|
metadatas=[probe_meta],
|
||||||
|
)
|
||||||
|
|
||||||
|
after_upsert = col.get(ids=[probe_id], include=[])
|
||||||
|
if probe_id not in _result_ids(after_upsert):
|
||||||
|
return False
|
||||||
|
|
||||||
|
col.delete(ids=[probe_id])
|
||||||
|
|
||||||
|
after_delete = col.get(ids=[probe_id], include=[])
|
||||||
|
if probe_id in _result_ids(after_delete):
|
||||||
|
return False
|
||||||
|
|
||||||
|
return True
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
def migrate(palace_path: str, dry_run: bool = False, confirm: bool = False):
|
def migrate(palace_path: str, dry_run: bool = False, confirm: bool = False):
|
||||||
"""Migrate a palace to the currently installed ChromaDB version."""
|
"""Migrate a palace to the currently installed ChromaDB version."""
|
||||||
from .backends.chroma import ChromaBackend
|
from .backends.chroma import ChromaBackend
|
||||||
@@ -179,13 +229,24 @@ def migrate(palace_path: str, dry_run: bool = False, confirm: bool = False):
|
|||||||
print(f" Source: ChromaDB {source_version}")
|
print(f" Source: ChromaDB {source_version}")
|
||||||
print(f" Target: ChromaDB {target_version}")
|
print(f" Target: ChromaDB {target_version}")
|
||||||
|
|
||||||
# Try reading with current chromadb first
|
# Try reading and writing with current chromadb first.
|
||||||
|
#
|
||||||
|
# A plain count() is not enough: some 0.6.x -> 1.5.x migrated collections
|
||||||
|
# are readable but silently drop upsert/delete operations. In that state,
|
||||||
|
# migrate must rebuild from SQLite instead of returning "No migration needed."
|
||||||
try:
|
try:
|
||||||
col = ChromaBackend().get_collection(palace_path, "mempalace_drawers")
|
col = ChromaBackend().get_collection(palace_path, "mempalace_drawers")
|
||||||
count = col.count()
|
count = col.count()
|
||||||
print(f"\n Palace is already readable by chromadb {target_version}.")
|
|
||||||
|
if collection_write_roundtrip_works(col):
|
||||||
|
print(f"\n Palace is already readable and writable by chromadb {target_version}.")
|
||||||
print(f" {count} drawers found. No migration needed.")
|
print(f" {count} drawers found. No migration needed.")
|
||||||
return True
|
return True
|
||||||
|
|
||||||
|
print(
|
||||||
|
f"\n Palace is readable by chromadb {target_version}, but write/delete verification failed."
|
||||||
|
)
|
||||||
|
print(" Rebuilding from SQLite to restore native write/delete behavior...")
|
||||||
except Exception:
|
except Exception:
|
||||||
print(f"\n Palace is NOT readable by chromadb {target_version}.")
|
print(f"\n Palace is NOT readable by chromadb {target_version}.")
|
||||||
print(" Extracting from SQLite directly...")
|
print(" Extracting from SQLite directly...")
|
||||||
|
|||||||
+38
-1
@@ -12,6 +12,7 @@ import sys
|
|||||||
import shlex
|
import shlex
|
||||||
import hashlib
|
import hashlib
|
||||||
import fnmatch
|
import fnmatch
|
||||||
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
from collections import defaultdict
|
from collections import defaultdict
|
||||||
@@ -31,6 +32,8 @@ from .palace import (
|
|||||||
upsert_closet_lines,
|
upsert_closet_lines,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
logger = logging.getLogger("mempalace_mcp")
|
||||||
|
|
||||||
READABLE_EXTENSIONS = {
|
READABLE_EXTENSIONS = {
|
||||||
".txt",
|
".txt",
|
||||||
".md",
|
".md",
|
||||||
@@ -63,6 +66,8 @@ SKIP_FILENAMES = {
|
|||||||
"mempal.yml",
|
"mempal.yml",
|
||||||
".gitignore",
|
".gitignore",
|
||||||
"package-lock.json",
|
"package-lock.json",
|
||||||
|
"pnpm-lock.yaml",
|
||||||
|
"yarn.lock",
|
||||||
}
|
}
|
||||||
|
|
||||||
CHUNK_SIZE = 800 # chars per drawer
|
CHUNK_SIZE = 800 # chars per drawer
|
||||||
@@ -70,6 +75,13 @@ CHUNK_OVERLAP = 100 # overlap between chunks
|
|||||||
MIN_CHUNK_SIZE = 50 # skip tiny chunks
|
MIN_CHUNK_SIZE = 50 # skip tiny chunks
|
||||||
DRAWER_UPSERT_BATCH_SIZE = 1000
|
DRAWER_UPSERT_BATCH_SIZE = 1000
|
||||||
MAX_FILE_SIZE = 500 * 1024 * 1024 # 500 MB — skip files larger than this.
|
MAX_FILE_SIZE = 500 * 1024 * 1024 # 500 MB — skip files larger than this.
|
||||||
|
# A single file producing more chunks than this is almost always a generated
|
||||||
|
# artifact (CSV/JSON dump, lockfile not in SKIP_FILENAMES, etc.). Embedding
|
||||||
|
# thousands of chunks from one file in one batch has triggered ONNX runtime
|
||||||
|
# `bad allocation` errors on Windows (#1296). The cap is conservative: a
|
||||||
|
# 500-chunk file at CHUNK_SIZE=800 is ~400 KB of source, which covers most
|
||||||
|
# legitimate hand-written content while bounding the worst-case batch.
|
||||||
|
MAX_CHUNKS_PER_FILE = 500
|
||||||
# Long Claude Code sessions and large transcript exports routinely exceed
|
# Long Claude Code sessions and large transcript exports routinely exceed
|
||||||
# 10 MB. The cap exists as a defensive rail against pathological binary
|
# 10 MB. The cap exists as a defensive rail against pathological binary
|
||||||
# files, not as a limit on legitimate text. Per-drawer size is bounded
|
# files, not as a limit on legitimate text. Per-drawer size is bounded
|
||||||
@@ -822,6 +834,13 @@ def process_file(
|
|||||||
room = detect_room(filepath, content, rooms, project_path)
|
room = detect_room(filepath, content, rooms, project_path)
|
||||||
chunks = chunk_text(content, source_file)
|
chunks = chunk_text(content, source_file)
|
||||||
|
|
||||||
|
if len(chunks) > MAX_CHUNKS_PER_FILE:
|
||||||
|
print(
|
||||||
|
f" ! [skip] {filepath.name[:50]:50} produced {len(chunks)} chunks "
|
||||||
|
f"(> {MAX_CHUNKS_PER_FILE}); add to SKIP_FILENAMES or .gitignore"
|
||||||
|
)
|
||||||
|
return 0, room
|
||||||
|
|
||||||
if dry_run:
|
if dry_run:
|
||||||
print(f" [DRY RUN] {filepath.name} -> room:{room} ({len(chunks)} drawers)")
|
print(f" [DRY RUN] {filepath.name} -> room:{room} ({len(chunks)} drawers)")
|
||||||
return len(chunks), room
|
return len(chunks), room
|
||||||
@@ -842,7 +861,7 @@ def process_file(
|
|||||||
try:
|
try:
|
||||||
collection.delete(where={"source_file": source_file})
|
collection.delete(where={"source_file": source_file})
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Stale-drawer purge failed for %s", source_file, exc_info=True)
|
||||||
|
|
||||||
# Batch chunks into bounded upserts so the embedding model sees many
|
# Batch chunks into bounded upserts so the embedding model sees many
|
||||||
# chunks per forward pass without building one huge Chroma/SQLite
|
# chunks per forward pass without building one huge Chroma/SQLite
|
||||||
@@ -1164,6 +1183,24 @@ def _mine_impl(
|
|||||||
"already-filed drawers are\n upserted idempotently and will not duplicate.\n"
|
"already-filed drawers are\n upserted idempotently and will not duplicate.\n"
|
||||||
)
|
)
|
||||||
sys.exit(130)
|
sys.exit(130)
|
||||||
|
except Exception as exc:
|
||||||
|
# Without this, an arbitrary exception (ONNX bad_alloc, chromadb HNSW
|
||||||
|
# error, OS fault) propagates and the process exits with no completion
|
||||||
|
# banner — the operator sees only the final progress line and assumes
|
||||||
|
# the mine succeeded (#1296). Print the partial-progress summary the
|
||||||
|
# way we do for KeyboardInterrupt, then re-raise so the original
|
||||||
|
# traceback still surfaces and the exit code is non-zero.
|
||||||
|
print("\n\n Mine aborted by exception.")
|
||||||
|
print(f" files_processed: {files_processed}/{len(files)}")
|
||||||
|
print(f" drawers_filed: {total_drawers}")
|
||||||
|
print(f" last_file: {last_file or '<none>'}")
|
||||||
|
print(f" error: {type(exc).__name__}: {exc}")
|
||||||
|
print(
|
||||||
|
f"\n Re-run `mempalace mine {shlex.quote(project_dir)}` after addressing "
|
||||||
|
"the cause — already-filed\n drawers are upserted idempotently and will "
|
||||||
|
"not duplicate.\n"
|
||||||
|
)
|
||||||
|
raise
|
||||||
finally:
|
finally:
|
||||||
# Clean up the hooks-side PID lock if it points at us. Stale
|
# Clean up the hooks-side PID lock if it points at us. Stale
|
||||||
# entries already pass _pid_alive() == False on POSIX, but
|
# entries already pass _pid_alive() == False on POSIX, but
|
||||||
|
|||||||
@@ -118,14 +118,14 @@ def normalize(filepath: str) -> str:
|
|||||||
try:
|
try:
|
||||||
file_size = os.path.getsize(filepath)
|
file_size = os.path.getsize(filepath)
|
||||||
except OSError as e:
|
except OSError as e:
|
||||||
raise IOError(f"Could not read {filepath}: {e}")
|
raise IOError(f"Could not read {filepath}: {e}") from e
|
||||||
if file_size > 500 * 1024 * 1024: # 500 MB safety limit
|
if file_size > 500 * 1024 * 1024: # 500 MB safety limit
|
||||||
raise IOError(f"File too large ({file_size // (1024 * 1024)} MB): {filepath}")
|
raise IOError(f"File too large ({file_size // (1024 * 1024)} MB): {filepath}")
|
||||||
try:
|
try:
|
||||||
with open(filepath, "r", encoding="utf-8", errors="replace") as f:
|
with open(filepath, "r", encoding="utf-8", errors="replace") as f:
|
||||||
content = f.read()
|
content = f.read()
|
||||||
except OSError as e:
|
except OSError as e:
|
||||||
raise IOError(f"Could not read {filepath}: {e}")
|
raise IOError(f"Could not read {filepath}: {e}") from e
|
||||||
|
|
||||||
if not content.strip():
|
if not content.strip():
|
||||||
return content
|
return content
|
||||||
|
|||||||
+68
-3
@@ -6,11 +6,16 @@ Consolidates collection access patterns used by both miners and the MCP server.
|
|||||||
|
|
||||||
import contextlib
|
import contextlib
|
||||||
import hashlib
|
import hashlib
|
||||||
|
import logging
|
||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
|
import threading
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
from .backends.chroma import ChromaBackend
|
from .backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
logger = logging.getLogger("mempalace_mcp")
|
||||||
|
|
||||||
SKIP_DIRS = {
|
SKIP_DIRS = {
|
||||||
".git",
|
".git",
|
||||||
"node_modules",
|
"node_modules",
|
||||||
@@ -52,10 +57,14 @@ NORMALIZE_VERSION = 2
|
|||||||
|
|
||||||
def get_collection(
|
def get_collection(
|
||||||
palace_path: str,
|
palace_path: str,
|
||||||
collection_name: str = "mempalace_drawers",
|
collection_name: Optional[str] = None,
|
||||||
create: bool = True,
|
create: bool = True,
|
||||||
):
|
):
|
||||||
"""Get the palace collection through the backend layer."""
|
"""Get the palace collection through the backend layer."""
|
||||||
|
if collection_name is None:
|
||||||
|
from .config import get_configured_collection_name
|
||||||
|
|
||||||
|
collection_name = get_configured_collection_name()
|
||||||
return _DEFAULT_BACKEND.get_collection(
|
return _DEFAULT_BACKEND.get_collection(
|
||||||
palace_path,
|
palace_path,
|
||||||
collection_name=collection_name,
|
collection_name=collection_name,
|
||||||
@@ -228,7 +237,7 @@ def purge_file_closets(closets_col, source_file: str) -> None:
|
|||||||
try:
|
try:
|
||||||
closets_col.delete(where={"source_file": source_file})
|
closets_col.delete(where={"source_file": source_file})
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Closet purge failed for %s", source_file, exc_info=True)
|
||||||
|
|
||||||
|
|
||||||
def upsert_closet_lines(closets_col, closet_id_base, lines, metadata):
|
def upsert_closet_lines(closets_col, closet_id_base, lines, metadata):
|
||||||
@@ -306,7 +315,7 @@ def mine_lock(source_file: str):
|
|||||||
|
|
||||||
fcntl.flock(lf, fcntl.LOCK_UN)
|
fcntl.flock(lf, fcntl.LOCK_UN)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Mine-lock release failed", exc_info=True)
|
||||||
lf.close()
|
lf.close()
|
||||||
|
|
||||||
|
|
||||||
@@ -314,6 +323,47 @@ class MineAlreadyRunning(RuntimeError):
|
|||||||
"""Raised when another `mempalace mine` already holds the per-palace lock."""
|
"""Raised when another `mempalace mine` already holds the per-palace lock."""
|
||||||
|
|
||||||
|
|
||||||
|
# Per-thread record of palaces this thread already holds the lock for. Used by
|
||||||
|
# `mine_palace_lock` to short-circuit re-entrant acquisition from the same
|
||||||
|
# thread (e.g. miner.mine() acquires the outer lock then calls
|
||||||
|
# ChromaCollection.upsert which now also tries to acquire). Without this guard
|
||||||
|
# the inner call would block on its own outer flock (Linux fcntl locks are per
|
||||||
|
# open file description, so a same-thread second open of the lock file is a
|
||||||
|
# distinct lock and self-deadlocks).
|
||||||
|
#
|
||||||
|
# The holder set is tagged with ``pid`` so that a forked child does NOT
|
||||||
|
# inherit re-entrant credit from its parent: the OS-level flock IS NOT
|
||||||
|
# inherited as a "we hold it" semantically — the child must reacquire — but
|
||||||
|
# Python's ``threading.local`` IS inherited across fork. The pid check
|
||||||
|
# clears stale state so a forked child correctly hits the fcntl path.
|
||||||
|
_palace_lock_holders = threading.local()
|
||||||
|
|
||||||
|
|
||||||
|
def _holder_state():
|
||||||
|
"""Return the per-thread (pid, keys) record, refreshing after fork."""
|
||||||
|
keys = getattr(_palace_lock_holders, "keys", None)
|
||||||
|
pid = getattr(_palace_lock_holders, "pid", None)
|
||||||
|
current_pid = os.getpid()
|
||||||
|
if keys is None or pid != current_pid:
|
||||||
|
keys = set()
|
||||||
|
_palace_lock_holders.keys = keys
|
||||||
|
_palace_lock_holders.pid = current_pid
|
||||||
|
return keys
|
||||||
|
|
||||||
|
|
||||||
|
def _held_by_this_thread(lock_key: str) -> bool:
|
||||||
|
"""Return True if this thread already holds ``mine_palace_lock`` for ``lock_key``."""
|
||||||
|
return lock_key in _holder_state()
|
||||||
|
|
||||||
|
|
||||||
|
def _mark_held(lock_key: str) -> None:
|
||||||
|
_holder_state().add(lock_key)
|
||||||
|
|
||||||
|
|
||||||
|
def _mark_released(lock_key: str) -> None:
|
||||||
|
_holder_state().discard(lock_key)
|
||||||
|
|
||||||
|
|
||||||
@contextlib.contextmanager
|
@contextlib.contextmanager
|
||||||
def mine_palace_lock(palace_path: str):
|
def mine_palace_lock(palace_path: str):
|
||||||
"""Per-palace non-blocking lock around the full `mine` pipeline.
|
"""Per-palace non-blocking lock around the full `mine` pipeline.
|
||||||
@@ -338,6 +388,12 @@ def mine_palace_lock(palace_path: str):
|
|||||||
Non-blocking: if another `mine` is already writing to this palace,
|
Non-blocking: if another `mine` is already writing to this palace,
|
||||||
raise MineAlreadyRunning so the caller can exit cleanly instead of
|
raise MineAlreadyRunning so the caller can exit cleanly instead of
|
||||||
piling up as a waiting worker.
|
piling up as a waiting worker.
|
||||||
|
|
||||||
|
Re-entrant: if the current thread already holds the lock for the same
|
||||||
|
palace, the context manager passes through without re-acquiring. This
|
||||||
|
lets ChromaCollection write methods (which acquire the lock themselves
|
||||||
|
to protect MCP/direct callers) compose with miner.mine() (which holds
|
||||||
|
the outer lock for the entire mine pipeline) without self-deadlock.
|
||||||
"""
|
"""
|
||||||
lock_dir = os.path.join(os.path.expanduser("~"), ".mempalace", "locks")
|
lock_dir = os.path.join(os.path.expanduser("~"), ".mempalace", "locks")
|
||||||
os.makedirs(lock_dir, exist_ok=True)
|
os.makedirs(lock_dir, exist_ok=True)
|
||||||
@@ -346,6 +402,11 @@ def mine_palace_lock(palace_path: str):
|
|||||||
palace_key = hashlib.sha256(lock_key_source.encode()).hexdigest()[:16]
|
palace_key = hashlib.sha256(lock_key_source.encode()).hexdigest()[:16]
|
||||||
lock_path = os.path.join(lock_dir, f"mine_palace_{palace_key}.lock")
|
lock_path = os.path.join(lock_dir, f"mine_palace_{palace_key}.lock")
|
||||||
|
|
||||||
|
if _held_by_this_thread(palace_key):
|
||||||
|
# Same thread already holds the lock for this palace — pass through.
|
||||||
|
yield
|
||||||
|
return
|
||||||
|
|
||||||
lf = open(lock_path, "w")
|
lf = open(lock_path, "w")
|
||||||
acquired = False
|
acquired = False
|
||||||
try:
|
try:
|
||||||
@@ -369,7 +430,11 @@ def mine_palace_lock(palace_path: str):
|
|||||||
raise MineAlreadyRunning(
|
raise MineAlreadyRunning(
|
||||||
f"another `mempalace mine` is already running against {resolved}"
|
f"another `mempalace mine` is already running against {resolved}"
|
||||||
) from exc
|
) from exc
|
||||||
|
_mark_held(palace_key)
|
||||||
|
try:
|
||||||
yield
|
yield
|
||||||
|
finally:
|
||||||
|
_mark_released(palace_key)
|
||||||
finally:
|
finally:
|
||||||
if acquired:
|
if acquired:
|
||||||
try:
|
try:
|
||||||
|
|||||||
@@ -575,7 +575,7 @@ def follow_tunnels(wing: str, room: str, col=None, config=None):
|
|||||||
if did and did in drawer_map:
|
if did and did in drawer_map:
|
||||||
c["drawer_preview"] = drawer_map[did][:300]
|
c["drawer_preview"] = drawer_map[did][:300]
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("Drawer preview hydration failed", exc_info=True)
|
||||||
|
|
||||||
return connections
|
return connections
|
||||||
|
|
||||||
|
|||||||
+588
-47
@@ -34,13 +34,58 @@ import os
|
|||||||
import shutil
|
import shutil
|
||||||
import sqlite3
|
import sqlite3
|
||||||
import time
|
import time
|
||||||
|
from collections import defaultdict
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
from typing import Optional
|
from typing import Iterator, Optional
|
||||||
|
|
||||||
|
from chromadb.errors import NotFoundError as ChromaNotFoundError
|
||||||
|
|
||||||
from .backends.chroma import ChromaBackend, hnsw_capacity_status
|
from .backends.chroma import ChromaBackend, hnsw_capacity_status
|
||||||
|
|
||||||
|
|
||||||
COLLECTION_NAME = "mempalace_drawers"
|
COLLECTION_NAME = "mempalace_drawers"
|
||||||
|
REPAIR_TEMP_COLLECTION = f"{COLLECTION_NAME}__repair_tmp"
|
||||||
|
|
||||||
|
# The closets collection (AAAK index layer) is intentionally fixed —
|
||||||
|
# closets reference drawer IDs by string and live alongside drawers in the
|
||||||
|
# same palace; renaming the closets collection per-deployment would break
|
||||||
|
# cross-palace AAAK lookups. Drawer collection name comes from config
|
||||||
|
# (see ``_recoverable_collections``).
|
||||||
|
CLOSETS_COLLECTION_NAME = "mempalace_closets"
|
||||||
|
|
||||||
|
|
||||||
|
def _drawers_collection_name() -> str:
|
||||||
|
"""Resolve the drawers collection name from user config, falling back
|
||||||
|
to the module default ``COLLECTION_NAME`` if config is unreadable.
|
||||||
|
|
||||||
|
Recovery flows must honor ``MempalaceConfig().collection_name`` so a
|
||||||
|
user with a non-default drawer collection (e.g. multi-palace setups)
|
||||||
|
rebuilds the right rows. Closets remain fixed — see
|
||||||
|
``CLOSETS_COLLECTION_NAME``.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
from .config import MempalaceConfig
|
||||||
|
|
||||||
|
return MempalaceConfig().collection_name or COLLECTION_NAME
|
||||||
|
except Exception:
|
||||||
|
return COLLECTION_NAME
|
||||||
|
|
||||||
|
|
||||||
|
def _recoverable_collections() -> tuple[str, ...]:
|
||||||
|
"""Collections rebuilt by ``rebuild_from_sqlite``, in upsert order.
|
||||||
|
|
||||||
|
Drawers first (bulk data), then closets (AAAK index layer that
|
||||||
|
references drawer IDs by string in their documents — no
|
||||||
|
foreign-key validation, so ordering is informational, not
|
||||||
|
load-bearing).
|
||||||
|
"""
|
||||||
|
return (_drawers_collection_name(), CLOSETS_COLLECTION_NAME)
|
||||||
|
|
||||||
|
|
||||||
|
# Back-compat alias for callers that imported the constant. New code
|
||||||
|
# should call ``_recoverable_collections()`` so config changes are picked
|
||||||
|
# up at call time.
|
||||||
|
RECOVERABLE_COLLECTIONS = (COLLECTION_NAME, CLOSETS_COLLECTION_NAME)
|
||||||
|
|
||||||
|
|
||||||
def _get_palace_path():
|
def _get_palace_path():
|
||||||
@@ -83,7 +128,111 @@ def _paginate_ids(col, where=None):
|
|||||||
return ids
|
return ids
|
||||||
|
|
||||||
|
|
||||||
def scan_palace(palace_path=None, only_wing=None):
|
def _extract_drawers(col, total: int, batch_size: int):
|
||||||
|
all_ids = []
|
||||||
|
all_docs = []
|
||||||
|
all_metas = []
|
||||||
|
offset = 0
|
||||||
|
while offset < total:
|
||||||
|
batch = col.get(limit=batch_size, offset=offset, include=["documents", "metadatas"])
|
||||||
|
if not batch["ids"]:
|
||||||
|
break
|
||||||
|
all_ids.extend(batch["ids"])
|
||||||
|
all_docs.extend(batch["documents"])
|
||||||
|
all_metas.extend(batch["metadatas"])
|
||||||
|
offset += len(batch["ids"])
|
||||||
|
return all_ids, all_docs, all_metas
|
||||||
|
|
||||||
|
|
||||||
|
def _verify_collection_count(col, expected: int, label: str) -> None:
|
||||||
|
actual = col.count()
|
||||||
|
if actual != expected:
|
||||||
|
raise RuntimeError(f"{label} count mismatch: expected {expected}, got {actual}")
|
||||||
|
|
||||||
|
|
||||||
|
def _is_missing_collection_value_error(exc: ValueError) -> bool:
|
||||||
|
message = str(exc).lower()
|
||||||
|
return "does not exist" in message or "not found" in message
|
||||||
|
|
||||||
|
|
||||||
|
def _delete_collection_if_exists(backend, palace_path: str, collection_name: str) -> None:
|
||||||
|
try:
|
||||||
|
backend.delete_collection(palace_path, collection_name)
|
||||||
|
except ValueError as exc:
|
||||||
|
if _is_missing_collection_value_error(exc):
|
||||||
|
return
|
||||||
|
raise
|
||||||
|
except (FileNotFoundError, ChromaNotFoundError):
|
||||||
|
return
|
||||||
|
|
||||||
|
|
||||||
|
class RebuildCollectionError(RuntimeError):
|
||||||
|
"""Raised when temp rebuild fails, carrying whether the live swap happened."""
|
||||||
|
|
||||||
|
def __init__(self, message: str, *, live_replaced: bool):
|
||||||
|
super().__init__(message)
|
||||||
|
self.live_replaced = live_replaced
|
||||||
|
|
||||||
|
|
||||||
|
def _rebuild_collection_via_temp(
|
||||||
|
backend,
|
||||||
|
palace_path: str,
|
||||||
|
all_ids,
|
||||||
|
all_docs,
|
||||||
|
all_metas,
|
||||||
|
batch_size: int,
|
||||||
|
collection_name: Optional[str] = None,
|
||||||
|
progress=print,
|
||||||
|
) -> int:
|
||||||
|
expected = len(all_ids)
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
|
temp_name = f"{collection_name}__repair_tmp"
|
||||||
|
live_replaced = False
|
||||||
|
|
||||||
|
try:
|
||||||
|
_delete_collection_if_exists(backend, palace_path, temp_name)
|
||||||
|
|
||||||
|
progress(f" Building temporary collection: {temp_name}")
|
||||||
|
temp_col = backend.create_collection(palace_path, temp_name)
|
||||||
|
staged = 0
|
||||||
|
for i in range(0, expected, batch_size):
|
||||||
|
batch_ids = all_ids[i : i + batch_size]
|
||||||
|
batch_docs = all_docs[i : i + batch_size]
|
||||||
|
batch_metas = all_metas[i : i + batch_size]
|
||||||
|
temp_col.upsert(documents=batch_docs, ids=batch_ids, metadatas=batch_metas)
|
||||||
|
staged += len(batch_ids)
|
||||||
|
progress(f" Staged {staged}/{expected} drawers...")
|
||||||
|
_verify_collection_count(temp_col, expected, "temporary rebuild")
|
||||||
|
|
||||||
|
progress(" Rebuilding live collection...")
|
||||||
|
backend.delete_collection(palace_path, collection_name)
|
||||||
|
live_replaced = True
|
||||||
|
new_col = backend.create_collection(palace_path, collection_name)
|
||||||
|
|
||||||
|
rebuilt = 0
|
||||||
|
for i in range(0, expected, batch_size):
|
||||||
|
batch_ids = all_ids[i : i + batch_size]
|
||||||
|
batch_docs = all_docs[i : i + batch_size]
|
||||||
|
batch_metas = all_metas[i : i + batch_size]
|
||||||
|
new_col.upsert(documents=batch_docs, ids=batch_ids, metadatas=batch_metas)
|
||||||
|
rebuilt += len(batch_ids)
|
||||||
|
progress(f" Re-filed {rebuilt}/{expected} drawers...")
|
||||||
|
_verify_collection_count(new_col, expected, "rebuilt live collection")
|
||||||
|
|
||||||
|
try:
|
||||||
|
_delete_collection_if_exists(backend, palace_path, temp_name)
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
return rebuilt
|
||||||
|
except Exception as exc:
|
||||||
|
try:
|
||||||
|
_delete_collection_if_exists(backend, palace_path, temp_name)
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
raise RebuildCollectionError(str(exc), live_replaced=live_replaced) from exc
|
||||||
|
|
||||||
|
|
||||||
|
def scan_palace(palace_path=None, only_wing=None, collection_name: Optional[str] = None):
|
||||||
"""Scan the palace for corrupt/unfetchable IDs.
|
"""Scan the palace for corrupt/unfetchable IDs.
|
||||||
|
|
||||||
Probes in batches of 100, falls back to per-ID on failure.
|
Probes in batches of 100, falls back to per-ID on failure.
|
||||||
@@ -92,14 +241,15 @@ def scan_palace(palace_path=None, only_wing=None):
|
|||||||
Returns (good_set, bad_set).
|
Returns (good_set, bad_set).
|
||||||
"""
|
"""
|
||||||
palace_path = palace_path or _get_palace_path()
|
palace_path = palace_path or _get_palace_path()
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
print(f"\n Palace: {palace_path}")
|
print(f"\n Palace: {palace_path}")
|
||||||
print(" Loading...")
|
print(" Loading...")
|
||||||
|
|
||||||
col = ChromaBackend().get_collection(palace_path, COLLECTION_NAME)
|
col = ChromaBackend().get_collection(palace_path, collection_name)
|
||||||
|
|
||||||
where = {"wing": only_wing} if only_wing else None
|
where = {"wing": only_wing} if only_wing else None
|
||||||
total = col.count()
|
total = col.count()
|
||||||
print(f" Collection: {COLLECTION_NAME}, total: {total:,}")
|
print(f" Collection: {collection_name}, total: {total:,}")
|
||||||
if only_wing:
|
if only_wing:
|
||||||
print(f" Scanning wing: {only_wing}")
|
print(f" Scanning wing: {only_wing}")
|
||||||
|
|
||||||
@@ -160,9 +310,10 @@ def scan_palace(palace_path=None, only_wing=None):
|
|||||||
return good_set, bad_set
|
return good_set, bad_set
|
||||||
|
|
||||||
|
|
||||||
def prune_corrupt(palace_path=None, confirm=False):
|
def prune_corrupt(palace_path=None, confirm=False, collection_name: Optional[str] = None):
|
||||||
"""Delete corrupt IDs listed in corrupt_ids.txt."""
|
"""Delete corrupt IDs listed in corrupt_ids.txt."""
|
||||||
palace_path = palace_path or _get_palace_path()
|
palace_path = palace_path or _get_palace_path()
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
bad_file = os.path.join(palace_path, "corrupt_ids.txt")
|
bad_file = os.path.join(palace_path, "corrupt_ids.txt")
|
||||||
|
|
||||||
if not os.path.exists(bad_file):
|
if not os.path.exists(bad_file):
|
||||||
@@ -178,7 +329,7 @@ def prune_corrupt(palace_path=None, confirm=False):
|
|||||||
print(" Re-run with --confirm to actually delete.")
|
print(" Re-run with --confirm to actually delete.")
|
||||||
return
|
return
|
||||||
|
|
||||||
col = ChromaBackend().get_collection(palace_path, COLLECTION_NAME)
|
col = ChromaBackend().get_collection(palace_path, collection_name)
|
||||||
before = col.count()
|
before = col.count()
|
||||||
print(f" Collection size before: {before:,}")
|
print(f" Collection size before: {before:,}")
|
||||||
|
|
||||||
@@ -232,7 +383,10 @@ class TruncationDetected(Exception):
|
|||||||
|
|
||||||
|
|
||||||
def check_extraction_safety(
|
def check_extraction_safety(
|
||||||
palace_path: str, extracted: int, confirm_truncation_ok: bool = False
|
palace_path: str,
|
||||||
|
extracted: int,
|
||||||
|
confirm_truncation_ok: bool = False,
|
||||||
|
collection_name: Optional[str] = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
"""Cross-check that ``extracted`` matches the SQLite ground truth.
|
"""Cross-check that ``extracted`` matches the SQLite ground truth.
|
||||||
|
|
||||||
@@ -254,7 +408,8 @@ def check_extraction_safety(
|
|||||||
if confirm_truncation_ok:
|
if confirm_truncation_ok:
|
||||||
return
|
return
|
||||||
|
|
||||||
sqlite_count = sqlite_drawer_count(palace_path)
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
|
sqlite_count = sqlite_drawer_count(palace_path, collection_name)
|
||||||
cap_signal = extracted == CHROMADB_DEFAULT_GET_LIMIT
|
cap_signal = extracted == CHROMADB_DEFAULT_GET_LIMIT
|
||||||
|
|
||||||
if sqlite_count is not None and sqlite_count > extracted:
|
if sqlite_count is not None and sqlite_count > extracted:
|
||||||
@@ -290,7 +445,7 @@ def check_extraction_safety(
|
|||||||
raise TruncationDetected(message, sqlite_count, extracted)
|
raise TruncationDetected(message, sqlite_count, extracted)
|
||||||
|
|
||||||
|
|
||||||
def sqlite_drawer_count(palace_path: str) -> "int | None":
|
def sqlite_drawer_count(palace_path: str, collection_name: Optional[str] = None) -> "int | None":
|
||||||
"""Count rows in ``chroma.sqlite3.embeddings`` for the drawers collection.
|
"""Count rows in ``chroma.sqlite3.embeddings`` for the drawers collection.
|
||||||
|
|
||||||
Used as an independent ground-truth check against the chromadb
|
Used as an independent ground-truth check against the chromadb
|
||||||
@@ -302,6 +457,7 @@ def sqlite_drawer_count(palace_path: str) -> "int | None":
|
|||||||
drift, missing tables, locked file). Callers treat ``None`` as
|
drift, missing tables, locked file). Callers treat ``None`` as
|
||||||
"unknown" and fall back to the cap-detection check.
|
"unknown" and fall back to the cap-detection check.
|
||||||
"""
|
"""
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
sqlite_path = os.path.join(palace_path, "chroma.sqlite3")
|
sqlite_path = os.path.join(palace_path, "chroma.sqlite3")
|
||||||
if not os.path.exists(sqlite_path):
|
if not os.path.exists(sqlite_path):
|
||||||
return None
|
return None
|
||||||
@@ -318,7 +474,7 @@ def sqlite_drawer_count(palace_path: str) -> "int | None":
|
|||||||
JOIN collections c ON s.collection = c.id
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE c.name = ?
|
WHERE c.name = ?
|
||||||
""",
|
""",
|
||||||
(COLLECTION_NAME,),
|
(collection_name,),
|
||||||
).fetchone()
|
).fetchone()
|
||||||
return int(row[0]) if row and row[0] is not None else None
|
return int(row[0]) if row and row[0] is not None else None
|
||||||
finally:
|
finally:
|
||||||
@@ -395,7 +551,11 @@ def print_sqlite_integrity_abort(palace_path: str, errors: list[str]) -> None:
|
|||||||
print(" 6. Re-run `mempalace repair --yes`.")
|
print(" 6. Re-run `mempalace repair --yes`.")
|
||||||
|
|
||||||
|
|
||||||
def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
def rebuild_index(
|
||||||
|
palace_path=None,
|
||||||
|
confirm_truncation_ok: bool = False,
|
||||||
|
collection_name: Optional[str] = None,
|
||||||
|
):
|
||||||
"""Rebuild the HNSW index from scratch.
|
"""Rebuild the HNSW index from scratch.
|
||||||
|
|
||||||
1. Extract all drawers via ChromaDB get()
|
1. Extract all drawers via ChromaDB get()
|
||||||
@@ -410,6 +570,7 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
(typically only a concern for palaces sized at exactly 10 000 rows).
|
(typically only a concern for palaces sized at exactly 10 000 rows).
|
||||||
"""
|
"""
|
||||||
palace_path = palace_path or _get_palace_path()
|
palace_path = palace_path or _get_palace_path()
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
|
|
||||||
if not os.path.isdir(palace_path):
|
if not os.path.isdir(palace_path):
|
||||||
print(f"\n No palace found at {palace_path}")
|
print(f"\n No palace found at {palace_path}")
|
||||||
@@ -422,7 +583,7 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
|
|
||||||
backend = ChromaBackend()
|
backend = ChromaBackend()
|
||||||
try:
|
try:
|
||||||
col = backend.get_collection(palace_path, COLLECTION_NAME)
|
col = backend.get_collection(palace_path, collection_name)
|
||||||
total = col.count()
|
total = col.count()
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(f" Error reading palace: {e}")
|
print(f" Error reading palace: {e}")
|
||||||
@@ -438,18 +599,7 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
# Extract all drawers in batches
|
# Extract all drawers in batches
|
||||||
print("\n Extracting drawers...")
|
print("\n Extracting drawers...")
|
||||||
batch_size = 5000
|
batch_size = 5000
|
||||||
all_ids = []
|
all_ids, all_docs, all_metas = _extract_drawers(col, total, batch_size)
|
||||||
all_docs = []
|
|
||||||
all_metas = []
|
|
||||||
offset = 0
|
|
||||||
while offset < total:
|
|
||||||
batch = col.get(limit=batch_size, offset=offset, include=["documents", "metadatas"])
|
|
||||||
if not batch["ids"]:
|
|
||||||
break
|
|
||||||
all_ids.extend(batch["ids"])
|
|
||||||
all_docs.extend(batch["documents"])
|
|
||||||
all_metas.extend(batch["metadatas"])
|
|
||||||
offset += len(batch["ids"])
|
|
||||||
print(f" Extracted {len(all_ids)} drawers")
|
print(f" Extracted {len(all_ids)} drawers")
|
||||||
|
|
||||||
# ── #1208 guard ──────────────────────────────────────────────────
|
# ── #1208 guard ──────────────────────────────────────────────────
|
||||||
@@ -457,7 +607,12 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
# short of the SQLite ground truth (or when extraction == chromadb
|
# short of the SQLite ground truth (or when extraction == chromadb
|
||||||
# default get() cap and the SQLite check couldn't run).
|
# default get() cap and the SQLite check couldn't run).
|
||||||
try:
|
try:
|
||||||
check_extraction_safety(palace_path, len(all_ids), confirm_truncation_ok)
|
check_extraction_safety(
|
||||||
|
palace_path,
|
||||||
|
len(all_ids),
|
||||||
|
confirm_truncation_ok,
|
||||||
|
collection_name=collection_name,
|
||||||
|
)
|
||||||
except TruncationDetected as e:
|
except TruncationDetected as e:
|
||||||
print(e.message)
|
print(e.message)
|
||||||
return
|
return
|
||||||
@@ -477,28 +632,34 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
|
|
||||||
# Rebuild with correct HNSW settings
|
# Rebuild with correct HNSW settings
|
||||||
print(" Rebuilding collection with hnsw:space=cosine...")
|
print(" Rebuilding collection with hnsw:space=cosine...")
|
||||||
backend.delete_collection(palace_path, COLLECTION_NAME)
|
|
||||||
new_col = backend.create_collection(palace_path, COLLECTION_NAME)
|
|
||||||
|
|
||||||
filed = 0
|
|
||||||
try:
|
try:
|
||||||
for i in range(0, len(all_ids), batch_size):
|
filed = _rebuild_collection_via_temp(
|
||||||
batch_ids = all_ids[i : i + batch_size]
|
backend,
|
||||||
batch_docs = all_docs[i : i + batch_size]
|
palace_path,
|
||||||
batch_metas = all_metas[i : i + batch_size]
|
all_ids,
|
||||||
new_col.upsert(documents=batch_docs, ids=batch_ids, metadatas=batch_metas)
|
all_docs,
|
||||||
filed += len(batch_ids)
|
all_metas,
|
||||||
print(f" Re-filed {filed}/{len(all_ids)} drawers...")
|
batch_size,
|
||||||
except Exception as e:
|
collection_name=collection_name,
|
||||||
|
progress=print,
|
||||||
|
)
|
||||||
|
except RebuildCollectionError as e:
|
||||||
print(f"\n ERROR during rebuild: {e}")
|
print(f"\n ERROR during rebuild: {e}")
|
||||||
print(f" Only {filed}/{len(all_ids)} drawers were re-filed.")
|
print(" Rebuild aborted before completion.")
|
||||||
if os.path.exists(backup_path):
|
if e.live_replaced and os.path.exists(backup_path):
|
||||||
print(f" Restoring from backup: {backup_path}")
|
print(f" Restoring from backup: {backup_path}")
|
||||||
backend.delete_collection(palace_path, COLLECTION_NAME)
|
try:
|
||||||
|
_close_chroma_handles(palace_path, backend=backend)
|
||||||
|
_delete_collection_if_exists(backend, palace_path, collection_name)
|
||||||
shutil.copy2(backup_path, sqlite_path)
|
shutil.copy2(backup_path, sqlite_path)
|
||||||
print(" Backup restored. Palace is back to pre-repair state.")
|
print(" Backup restored. Palace is back to pre-repair state.")
|
||||||
else:
|
except Exception as restore_error:
|
||||||
|
print(f" Backup restore failed: {restore_error}")
|
||||||
|
print(f" Manual restore required from: {backup_path}")
|
||||||
|
elif e.live_replaced:
|
||||||
print(" No backup available. Re-mine from source files to recover.")
|
print(" No backup available. Re-mine from source files to recover.")
|
||||||
|
else:
|
||||||
|
print(" Live collection was not replaced; leaving the original palace untouched.")
|
||||||
raise
|
raise
|
||||||
|
|
||||||
print(f"\n Repair complete. {filed} drawers rebuilt.")
|
print(f"\n Repair complete. {filed} drawers rebuilt.")
|
||||||
@@ -506,7 +667,380 @@ def rebuild_index(palace_path=None, confirm_truncation_ok: bool = False):
|
|||||||
print(f"\n{'=' * 55}\n")
|
print(f"\n{'=' * 55}\n")
|
||||||
|
|
||||||
|
|
||||||
def status(palace_path=None) -> dict:
|
class RebuildPartialError(Exception):
|
||||||
|
"""Raised when ``rebuild_from_sqlite`` fails partway through upserts.
|
||||||
|
|
||||||
|
Carries enough state for the user (or CLI) to recover: the
|
||||||
|
per-collection counts that succeeded, the collection that failed,
|
||||||
|
the dest path holding the partial palace, and the archive path
|
||||||
|
(when an in-place rebuild had moved the original aside). Re-raises
|
||||||
|
the underlying chromadb error as ``__cause__``.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
message: str,
|
||||||
|
*,
|
||||||
|
partial_counts: dict[str, int],
|
||||||
|
failed_collection: str,
|
||||||
|
dest_palace: str,
|
||||||
|
archive_path: Optional[str],
|
||||||
|
):
|
||||||
|
super().__init__(message)
|
||||||
|
self.message = message
|
||||||
|
self.partial_counts = partial_counts
|
||||||
|
self.failed_collection = failed_collection
|
||||||
|
self.dest_palace = dest_palace
|
||||||
|
self.archive_path = archive_path
|
||||||
|
|
||||||
|
|
||||||
|
def _rebuild_one_collection(
|
||||||
|
*,
|
||||||
|
backend: ChromaBackend,
|
||||||
|
source_palace: str,
|
||||||
|
dest_palace: str,
|
||||||
|
collection_name: str,
|
||||||
|
batch_size: int,
|
||||||
|
archive_path: Optional[str],
|
||||||
|
counts_so_far: dict[str, int],
|
||||||
|
) -> int:
|
||||||
|
"""Stream rows for one collection from SQLite and upsert into a
|
||||||
|
freshly-created collection at ``dest_palace``. Returns rows
|
||||||
|
upserted. Raises :class:`RebuildPartialError` (with the underlying
|
||||||
|
chromadb exception as ``__cause__``) on any upsert failure so the
|
||||||
|
caller can stop the loop and print recovery instructions instead of
|
||||||
|
silently shipping a partial palace.
|
||||||
|
"""
|
||||||
|
ids: list[str] = []
|
||||||
|
docs: list[str] = []
|
||||||
|
metas: list[dict] = []
|
||||||
|
upserted = 0
|
||||||
|
col = None
|
||||||
|
|
||||||
|
def _flush() -> int:
|
||||||
|
nonlocal upserted
|
||||||
|
if not ids:
|
||||||
|
return upserted
|
||||||
|
col.upsert(ids=list(ids), documents=list(docs), metadatas=list(metas))
|
||||||
|
upserted += len(ids)
|
||||||
|
print(f" upserted {upserted}")
|
||||||
|
ids.clear()
|
||||||
|
docs.clear()
|
||||||
|
metas.clear()
|
||||||
|
return upserted
|
||||||
|
|
||||||
|
try:
|
||||||
|
# ``create_collection`` lives inside the try so a Chroma-side
|
||||||
|
# "Collection already exists" failure (which can happen when the
|
||||||
|
# process-wide System cache still holds a pre-archive schema) is
|
||||||
|
# reported as a structured ``RebuildPartialError`` carrying
|
||||||
|
# ``archive_path`` — instead of an unstructured exception that
|
||||||
|
# strands the user without recovery instructions.
|
||||||
|
col = backend.create_collection(dest_palace, collection_name)
|
||||||
|
|
||||||
|
for emb_id, doc, meta in extract_via_sqlite(source_palace, collection_name):
|
||||||
|
ids.append(emb_id)
|
||||||
|
docs.append(doc or "")
|
||||||
|
# chromadb 1.5.x rejects None entries in the metadatas list
|
||||||
|
# but accepts empty dicts. Mempalace drawers always carry at
|
||||||
|
# least wing/room, so this branch is defensive — corruption
|
||||||
|
# in embedding_metadata could yield an emb_id with no rows.
|
||||||
|
metas.append(meta if meta else {})
|
||||||
|
if len(ids) >= batch_size:
|
||||||
|
_flush()
|
||||||
|
_flush()
|
||||||
|
except Exception as exc: # noqa: BLE001 — chromadb raises many shapes
|
||||||
|
partial = dict(counts_so_far)
|
||||||
|
partial[collection_name] = upserted
|
||||||
|
msg_parts = [
|
||||||
|
f"Upsert failed in collection {collection_name!r} after {upserted} rows: {exc!r}",
|
||||||
|
f"Partial palace left at: {dest_palace}",
|
||||||
|
]
|
||||||
|
if archive_path is not None:
|
||||||
|
msg_parts.append(f"Original palace archived at: {archive_path}")
|
||||||
|
msg_parts.append(
|
||||||
|
" Recover by removing the partial dest and re-running with "
|
||||||
|
f"--source {archive_path}"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
msg_parts.append(" Source palace is unchanged. Remove the partial dest and re-run.")
|
||||||
|
message = "\n ".join(msg_parts)
|
||||||
|
print(f"\n ERROR: {message}")
|
||||||
|
raise RebuildPartialError(
|
||||||
|
message,
|
||||||
|
partial_counts=partial,
|
||||||
|
failed_collection=collection_name,
|
||||||
|
dest_palace=dest_palace,
|
||||||
|
archive_path=archive_path,
|
||||||
|
) from exc
|
||||||
|
|
||||||
|
return upserted
|
||||||
|
|
||||||
|
|
||||||
|
def extract_via_sqlite(palace_path: str, collection_name: str) -> Iterator[tuple[str, str, dict]]:
|
||||||
|
"""Yield ``(embedding_id, document, metadata)`` for every row in
|
||||||
|
``collection_name``'s metadata segment by reading ``chroma.sqlite3``
|
||||||
|
directly.
|
||||||
|
|
||||||
|
Bypasses the chromadb client entirely — never opens a
|
||||||
|
``PersistentClient``, never imports hnswlib, never invokes the
|
||||||
|
HNSW segment writer. This is the recovery path for palaces where
|
||||||
|
``Collection.count()`` / ``Collection.get()`` raise ``InternalError``
|
||||||
|
because the compactor cannot apply WAL logs to the HNSW segment
|
||||||
|
(#1308). The drawer rows are still on disk in
|
||||||
|
``embeddings`` + ``embedding_metadata``; the corruption lives in the
|
||||||
|
on-disk index files, not the SQLite tables.
|
||||||
|
|
||||||
|
Resolution rule for chromadb's typed metadata columns: each
|
||||||
|
``embedding_metadata`` row stores its value in exactly one of
|
||||||
|
``string_value`` / ``int_value`` / ``float_value`` / ``bool_value``;
|
||||||
|
we pick the first non-NULL column in that order. Rows where every
|
||||||
|
typed column is NULL are dropped (chromadb never writes that shape).
|
||||||
|
The ``chroma:document`` key is removed from the metadata dict and
|
||||||
|
returned as the document; this matches how chromadb itself stores
|
||||||
|
``add(documents=...)``.
|
||||||
|
|
||||||
|
Silent on missing palace, missing ``chroma.sqlite3``, or unknown
|
||||||
|
collection name — yields nothing. Callers that need to distinguish
|
||||||
|
"empty collection" from "collection not present" should query
|
||||||
|
:func:`sqlite_drawer_count` first.
|
||||||
|
"""
|
||||||
|
sqlite_path = os.path.join(palace_path, "chroma.sqlite3")
|
||||||
|
if not os.path.isfile(sqlite_path):
|
||||||
|
return
|
||||||
|
|
||||||
|
conn = sqlite3.connect(f"file:{sqlite_path}?mode=ro", uri=True)
|
||||||
|
try:
|
||||||
|
seg_row = conn.execute(
|
||||||
|
"""
|
||||||
|
SELECT s.id FROM segments s
|
||||||
|
JOIN collections c ON s.collection = c.id
|
||||||
|
WHERE c.name = ? AND s.scope = 'METADATA'
|
||||||
|
""",
|
||||||
|
(collection_name,),
|
||||||
|
).fetchone()
|
||||||
|
if not seg_row:
|
||||||
|
return
|
||||||
|
segment_id = seg_row[0]
|
||||||
|
|
||||||
|
per_id: dict[str, dict] = defaultdict(dict)
|
||||||
|
order: list[str] = []
|
||||||
|
for emb_id, key, sv, iv, fv, bv in conn.execute(
|
||||||
|
"""
|
||||||
|
SELECT e.embedding_id, em.key, em.string_value, em.int_value,
|
||||||
|
em.float_value, em.bool_value
|
||||||
|
FROM embedding_metadata em
|
||||||
|
JOIN embeddings e ON em.id = e.id
|
||||||
|
WHERE e.segment_id = ?
|
||||||
|
ORDER BY em.id
|
||||||
|
""",
|
||||||
|
(segment_id,),
|
||||||
|
):
|
||||||
|
if emb_id not in per_id:
|
||||||
|
order.append(emb_id)
|
||||||
|
if sv is not None:
|
||||||
|
per_id[emb_id][key] = sv
|
||||||
|
elif iv is not None:
|
||||||
|
per_id[emb_id][key] = iv
|
||||||
|
elif fv is not None:
|
||||||
|
per_id[emb_id][key] = fv
|
||||||
|
elif bv is not None:
|
||||||
|
per_id[emb_id][key] = bool(bv)
|
||||||
|
|
||||||
|
for emb_id in order:
|
||||||
|
kv = per_id[emb_id]
|
||||||
|
doc = kv.pop("chroma:document", "")
|
||||||
|
yield emb_id, doc, kv
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
def rebuild_from_sqlite(
|
||||||
|
source_palace: str,
|
||||||
|
dest_palace: str,
|
||||||
|
*,
|
||||||
|
archive_existing_dest: bool = False,
|
||||||
|
batch_size: int = 1000,
|
||||||
|
) -> dict[str, int]:
|
||||||
|
"""Rebuild a palace by reading drawers from ``source_palace``'s
|
||||||
|
``chroma.sqlite3`` and upserting them into a fresh palace at
|
||||||
|
``dest_palace``.
|
||||||
|
|
||||||
|
Recovery path for the #1308 failure mode: the chromadb client raises
|
||||||
|
``InternalError: Failed to apply logs to the hnsw segment writer``
|
||||||
|
on every operation that touches the index (``count``, ``get``,
|
||||||
|
``query``), but the underlying SQLite tables are intact. Both the
|
||||||
|
legacy ``rebuild_index`` and the inline ``cli.cmd_repair`` path call
|
||||||
|
``Collection.count()`` as their first read — exactly the call that
|
||||||
|
fails — so neither can recover this class of corruption. This
|
||||||
|
function bypasses the chromadb read path entirely via
|
||||||
|
:func:`extract_via_sqlite`.
|
||||||
|
|
||||||
|
Re-embeds documents at upsert time using the configured embedding
|
||||||
|
function; the original HNSW vectors are not preserved (they live in
|
||||||
|
the corrupt ``data_level0.bin`` / ``link_lists.bin``, not in
|
||||||
|
SQLite). Acceptable for a corruption-recovery flow because the
|
||||||
|
embedding model is deterministic — same model + same document text
|
||||||
|
yields semantically equivalent search results.
|
||||||
|
|
||||||
|
``archive_existing_dest`` controls behavior when ``dest_palace``
|
||||||
|
already exists:
|
||||||
|
|
||||||
|
* ``False`` (default) — refuse with a clear message. Callers must
|
||||||
|
manually move the existing palace aside first.
|
||||||
|
* ``True`` — rename ``dest_palace`` to
|
||||||
|
``<dest_palace>.pre-rebuild-<timestamp>`` and read from there
|
||||||
|
instead. Used by the in-place CLI flow where ``--source`` defaults
|
||||||
|
to the same path as ``--palace``.
|
||||||
|
|
||||||
|
Returns a ``{collection_name: row_count}`` dict so callers (CLI,
|
||||||
|
tests) can verify the per-collection rebuild count without parsing
|
||||||
|
stdout. A successful rebuild always returns a dict with one key per
|
||||||
|
recoverable collection (values may be ``0`` when a collection is
|
||||||
|
legitimately empty in the source). The empty dict ``{}`` is reserved
|
||||||
|
for validation refusals (missing source DB, refusing to overwrite an
|
||||||
|
existing dest, in-place mode without ``archive_existing_dest``); CLI
|
||||||
|
callers should treat ``{}`` as an error and exit non-zero so CI and
|
||||||
|
scripts can distinguish "invalid inputs" from "successful recovery
|
||||||
|
that found zero rows." Raises :class:`RebuildPartialError` if a
|
||||||
|
chromadb upsert fails partway through; the dest palace is left in
|
||||||
|
place so the user can inspect what landed, and the in-place archive
|
||||||
|
(when applicable) is reported in the error so the user can re-run
|
||||||
|
against it.
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
In-place mode (``source_palace == dest_palace`` with
|
||||||
|
``archive_existing_dest=True``) calls
|
||||||
|
``chromadb.api.client.SharedSystemClient.clear_system_cache()`` to
|
||||||
|
drop chromadb's process-wide System registry — required because
|
||||||
|
an existing cached System built against the original palace will
|
||||||
|
refuse ``create_collection`` after the dir is renamed (chromadb
|
||||||
|
still thinks the collections exist). This invalidates any
|
||||||
|
PersistentClient instances held elsewhere in the same process for
|
||||||
|
*any* palace, not just this one. Do not call this function from
|
||||||
|
inside a long-running mempalace process (MCP server, daemon)
|
||||||
|
while other callers hold live ``PersistentClient`` references —
|
||||||
|
use the CLI in a separate process instead. Cross-palace use
|
||||||
|
(``source != dest``) does not touch the cache.
|
||||||
|
|
||||||
|
Note on metadata fidelity: the resolution rule
|
||||||
|
(``string_value`` → ``int_value`` → ``float_value`` → ``bool_value``)
|
||||||
|
matches the precedent in :mod:`mempalace.migrate`. ChromaDB 0.4.x
|
||||||
|
occasionally wrote booleans as ``int_value=0/1``; those will
|
||||||
|
round-trip as ``int`` rather than ``bool`` after this rebuild. This
|
||||||
|
is a known divergence and matches the existing migrate-path
|
||||||
|
behavior.
|
||||||
|
"""
|
||||||
|
source_palace = os.path.abspath(os.path.expanduser(source_palace))
|
||||||
|
dest_palace = os.path.abspath(os.path.expanduser(dest_palace))
|
||||||
|
|
||||||
|
src_db = os.path.join(source_palace, "chroma.sqlite3")
|
||||||
|
|
||||||
|
in_place = source_palace == dest_palace
|
||||||
|
|
||||||
|
print(f"\n{'=' * 55}")
|
||||||
|
print(" MemPalace Repair — Rebuild from SQLite")
|
||||||
|
print(f"{'=' * 55}\n")
|
||||||
|
print(f" Source: {source_palace}")
|
||||||
|
print(f" Dest: {dest_palace}")
|
||||||
|
|
||||||
|
# Validate source BEFORE any destructive moves. An earlier draft
|
||||||
|
# archived the dest first and surfaced the missing-chroma.sqlite3
|
||||||
|
# error after — leaving the user with a renamed dir to manually undo
|
||||||
|
# when the archive itself was empty. Validate first so a user error
|
||||||
|
# (--source pointing at a non-palace dir) bails cleanly.
|
||||||
|
if in_place:
|
||||||
|
if not archive_existing_dest:
|
||||||
|
print(
|
||||||
|
"\n Source and dest are the same path. Pass "
|
||||||
|
"archive_existing_dest=True (CLI: --archive-existing) to move "
|
||||||
|
"the existing palace aside, or pass a different source_palace= "
|
||||||
|
"(CLI: --source)."
|
||||||
|
)
|
||||||
|
return {}
|
||||||
|
if not os.path.isfile(src_db):
|
||||||
|
print(f"\n Source palace has no chroma.sqlite3 at {src_db}")
|
||||||
|
return {}
|
||||||
|
else:
|
||||||
|
if not os.path.isfile(src_db):
|
||||||
|
print(f"\n Source palace has no chroma.sqlite3 at {src_db}")
|
||||||
|
return {}
|
||||||
|
if os.path.exists(dest_palace):
|
||||||
|
print(
|
||||||
|
f"\n Refusing to rebuild into existing path: {dest_palace}\n"
|
||||||
|
" Move it aside, pass a different dest, or set "
|
||||||
|
"archive_existing_dest=True if rebuilding in place "
|
||||||
|
"(source_palace == dest_palace)."
|
||||||
|
)
|
||||||
|
return {}
|
||||||
|
|
||||||
|
archive_path: Optional[str] = None
|
||||||
|
if in_place:
|
||||||
|
ts = datetime.now().strftime("%Y%m%d-%H%M%S")
|
||||||
|
archive_path = f"{dest_palace}.pre-rebuild-{ts}"
|
||||||
|
print(f" Archiving {dest_palace} → {archive_path}")
|
||||||
|
shutil.move(dest_palace, archive_path)
|
||||||
|
source_palace = archive_path
|
||||||
|
src_db = os.path.join(source_palace, "chroma.sqlite3")
|
||||||
|
|
||||||
|
# In-place only: drop chromadb's process-wide System registry so
|
||||||
|
# the new client at dest_palace builds a fresh System. Without
|
||||||
|
# this, ``create_collection`` raises "Collection already exists"
|
||||||
|
# because the cached System still holds the pre-rename schema.
|
||||||
|
# Cross-palace mode does not need this and would needlessly
|
||||||
|
# invalidate other callers' clients (see docstring warning).
|
||||||
|
try:
|
||||||
|
from chromadb.api.client import SharedSystemClient
|
||||||
|
|
||||||
|
SharedSystemClient.clear_system_cache()
|
||||||
|
except Exception as exc: # noqa: BLE001
|
||||||
|
print(
|
||||||
|
f" Warning: could not clear chromadb system cache ({exc!r}); "
|
||||||
|
"in-place rebuild may fail with 'Collection already exists'."
|
||||||
|
)
|
||||||
|
|
||||||
|
os.makedirs(dest_palace, exist_ok=True)
|
||||||
|
|
||||||
|
# Backend lifetime is wrapped in try/finally so the dest palace's
|
||||||
|
# PersistentClient handle (opened lazily inside ``create_collection``
|
||||||
|
# / ``get_collection``) is released on every exit path: success,
|
||||||
|
# ``RebuildPartialError``, or any unexpected exception. Without this,
|
||||||
|
# a long-running process that calls ``rebuild_from_sqlite`` would
|
||||||
|
# leak SQLite/HNSW file handles into Chroma's ``SharedSystemClient``
|
||||||
|
# cache, surfacing later as "Collection already exists" on the next
|
||||||
|
# in-place rebuild or as a Windows file-lock failure on cleanup
|
||||||
|
# (cf. #1285's lifecycle hardening for the legacy rebuild path).
|
||||||
|
backend = ChromaBackend()
|
||||||
|
counts: dict[str, int] = {}
|
||||||
|
try:
|
||||||
|
for cname in _recoverable_collections():
|
||||||
|
print(f"\n [{cname}]")
|
||||||
|
upserted = _rebuild_one_collection(
|
||||||
|
backend=backend,
|
||||||
|
source_palace=source_palace,
|
||||||
|
dest_palace=dest_palace,
|
||||||
|
collection_name=cname,
|
||||||
|
batch_size=batch_size,
|
||||||
|
archive_path=archive_path,
|
||||||
|
counts_so_far=counts,
|
||||||
|
)
|
||||||
|
counts[cname] = upserted
|
||||||
|
if upserted == 0:
|
||||||
|
print(f" no rows found for {cname} in source palace")
|
||||||
|
else:
|
||||||
|
print(f" done: {upserted} rows in {cname}")
|
||||||
|
|
||||||
|
print(f"\n Rebuild complete. {sum(counts.values())} total rows.")
|
||||||
|
if archive_path is not None:
|
||||||
|
print(f" Original palace archived at: {archive_path}")
|
||||||
|
print(f"{'=' * 55}\n")
|
||||||
|
return counts
|
||||||
|
finally:
|
||||||
|
backend.close()
|
||||||
|
|
||||||
|
|
||||||
|
def status(palace_path=None, collection_name: Optional[str] = None) -> dict:
|
||||||
"""Read-only health check: compare sqlite vs HNSW element counts.
|
"""Read-only health check: compare sqlite vs HNSW element counts.
|
||||||
|
|
||||||
Catches the #1222 failure mode where chromadb's HNSW segment freezes
|
Catches the #1222 failure mode where chromadb's HNSW segment freezes
|
||||||
@@ -524,6 +1058,7 @@ def status(palace_path=None) -> dict:
|
|||||||
``status="unknown"`` when no palace exists at the given path.
|
``status="unknown"`` when no palace exists at the given path.
|
||||||
"""
|
"""
|
||||||
palace_path = palace_path or _get_palace_path()
|
palace_path = palace_path or _get_palace_path()
|
||||||
|
collection_name = collection_name or _drawers_collection_name()
|
||||||
print(f"\n{'=' * 55}")
|
print(f"\n{'=' * 55}")
|
||||||
print(" MemPalace Repair — Status")
|
print(" MemPalace Repair — Status")
|
||||||
print(f"{'=' * 55}\n")
|
print(f"{'=' * 55}\n")
|
||||||
@@ -533,8 +1068,8 @@ def status(palace_path=None) -> dict:
|
|||||||
print(" No palace found.\n")
|
print(" No palace found.\n")
|
||||||
return {"status": "unknown", "message": "no palace at path"}
|
return {"status": "unknown", "message": "no palace at path"}
|
||||||
|
|
||||||
drawers = hnsw_capacity_status(palace_path, "mempalace_drawers")
|
drawers = hnsw_capacity_status(palace_path, collection_name)
|
||||||
closets = hnsw_capacity_status(palace_path, "mempalace_closets")
|
closets = hnsw_capacity_status(palace_path, CLOSETS_COLLECTION_NAME)
|
||||||
|
|
||||||
for label, info in (("drawers", drawers), ("closets", closets)):
|
for label, info in (("drawers", drawers), ("closets", closets)):
|
||||||
print(f"\n [{label}]")
|
print(f"\n [{label}]")
|
||||||
@@ -564,12 +1099,18 @@ def status(palace_path=None) -> dict:
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
def _close_chroma_handles(palace_path: str) -> None:
|
def _close_chroma_handles(palace_path: str, backend: "ChromaBackend | None" = None) -> None:
|
||||||
"""Drop ChromaBackend + chromadb singleton caches so OS mmap handles release."""
|
"""Drop ChromaBackend + chromadb singleton caches so OS mmap handles release.
|
||||||
|
|
||||||
|
When ``backend`` is provided, close the live instance so rollback/restore
|
||||||
|
releases the handles it was already using. Otherwise fall back to a
|
||||||
|
transient backend instance for the max-seq-id repair path.
|
||||||
|
"""
|
||||||
import gc
|
import gc
|
||||||
|
|
||||||
try:
|
try:
|
||||||
ChromaBackend().close_palace(palace_path)
|
closer = backend if backend is not None else ChromaBackend()
|
||||||
|
closer.close_palace(palace_path)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
try:
|
try:
|
||||||
|
|||||||
@@ -202,7 +202,7 @@ def detect_rooms_from_files(project_dir: str) -> list:
|
|||||||
|
|
||||||
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "venv", "dist", "build"}
|
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "venv", "dist", "build"}
|
||||||
|
|
||||||
for root, dirs, filenames in os.walk(project_path):
|
for _root, dirs, filenames in os.walk(project_path):
|
||||||
dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
|
dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
|
||||||
for filename in filenames:
|
for filename in filenames:
|
||||||
name_lower = filename.lower().replace("-", "_").replace(" ", "_")
|
name_lower = filename.lower().replace("-", "_").replace(" ", "_")
|
||||||
|
|||||||
+80
-25
@@ -245,7 +245,7 @@ def _expand_with_neighbors(drawers_col, matched_doc: str, matched_meta: dict, ra
|
|||||||
all_meta = drawers_col.get(where={"source_file": src}, include=["metadatas"])
|
all_meta = drawers_col.get(where={"source_file": src}, include=["metadatas"])
|
||||||
total_drawers = len(all_meta.ids) if all_meta.ids else None
|
total_drawers = len(all_meta.ids) if all_meta.ids else None
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
logger.debug("total_drawers lookup failed for %s", src, exc_info=True)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
"text": combined_text,
|
"text": combined_text,
|
||||||
@@ -297,10 +297,10 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r
|
|||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
col = get_collection(palace_path, create=False)
|
col = get_collection(palace_path, create=False)
|
||||||
except Exception:
|
except Exception as e:
|
||||||
print(f"\n No palace found at {palace_path}")
|
print(f"\n No palace found at {palace_path}")
|
||||||
print(" Run: mempalace init <dir> then mempalace mine <dir>")
|
print(" Run: mempalace init <dir> then mempalace mine <dir>")
|
||||||
raise SearchError(f"No palace found at {palace_path}")
|
raise SearchError(f"No palace found at {palace_path}") from e
|
||||||
|
|
||||||
# Alert the user if this palace predates hnsw:space=cosine being set on
|
# Alert the user if this palace predates hnsw:space=cosine being set on
|
||||||
# creation — their similarity scores will be junk until they run repair.
|
# creation — their similarity scores will be junk until they run repair.
|
||||||
@@ -340,7 +340,7 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r
|
|||||||
# `_hybrid_rank`; do the same here so CLI results match what agents
|
# `_hybrid_rank`; do the same here so CLI results match what agents
|
||||||
# see via `mempalace_search`.
|
# see via `mempalace_search`.
|
||||||
hits = [
|
hits = [
|
||||||
{"text": doc, "distance": float(dist), "metadata": meta or {}}
|
{"text": doc or "", "distance": float(dist), "metadata": meta or {}}
|
||||||
for doc, meta, dist in zip(docs, metas, dists)
|
for doc, meta, dist in zip(docs, metas, dists)
|
||||||
]
|
]
|
||||||
hits = _hybrid_rank(hits, query)
|
hits = _hybrid_rank(hits, query)
|
||||||
@@ -382,6 +382,7 @@ def _bm25_only_via_sqlite(
|
|||||||
n_results: int = 5,
|
n_results: int = 5,
|
||||||
max_candidates: int = 500,
|
max_candidates: int = 500,
|
||||||
_include_internal: bool = False,
|
_include_internal: bool = False,
|
||||||
|
collection_name: str = None,
|
||||||
) -> dict:
|
) -> dict:
|
||||||
"""BM25-only search reading drawers directly from chroma.sqlite3.
|
"""BM25-only search reading drawers directly from chroma.sqlite3.
|
||||||
|
|
||||||
@@ -405,6 +406,35 @@ def _bm25_only_via_sqlite(
|
|||||||
"error": "No palace found",
|
"error": "No palace found",
|
||||||
"hint": "Run: mempalace init <dir> && mempalace mine <dir>",
|
"hint": "Run: mempalace init <dir> && mempalace mine <dir>",
|
||||||
}
|
}
|
||||||
|
if collection_name is None:
|
||||||
|
from .config import get_configured_collection_name
|
||||||
|
|
||||||
|
collection_name = get_configured_collection_name()
|
||||||
|
|
||||||
|
def _metadata_filter_sql(row_id_expr: str) -> tuple[str, list[str]]:
|
||||||
|
clauses = []
|
||||||
|
params = []
|
||||||
|
for key, value in (("wing", wing), ("room", room)):
|
||||||
|
if not value:
|
||||||
|
continue
|
||||||
|
clauses.append(
|
||||||
|
f"""
|
||||||
|
AND EXISTS (
|
||||||
|
SELECT 1
|
||||||
|
FROM embedding_metadata mf
|
||||||
|
WHERE mf.id = {row_id_expr}
|
||||||
|
AND mf.key = ?
|
||||||
|
AND COALESCE(
|
||||||
|
mf.string_value,
|
||||||
|
CAST(mf.int_value AS TEXT),
|
||||||
|
CAST(mf.float_value AS TEXT),
|
||||||
|
CAST(mf.bool_value AS TEXT)
|
||||||
|
) = ?
|
||||||
|
)
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
params.extend([key, value])
|
||||||
|
return "".join(clauses), params
|
||||||
|
|
||||||
try:
|
try:
|
||||||
conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
|
conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
|
||||||
@@ -416,45 +446,57 @@ def _bm25_only_via_sqlite(
|
|||||||
# shorter than 3 chars (trigram tokenizer can't match them).
|
# shorter than 3 chars (trigram tokenizer can't match them).
|
||||||
tokens = [t for t in _tokenize(query) if len(t) >= 3]
|
tokens = [t for t in _tokenize(query) if len(t) >= 3]
|
||||||
candidate_ids: list[int] = []
|
candidate_ids: list[int] = []
|
||||||
|
use_recency_fallback = not tokens
|
||||||
if tokens:
|
if tokens:
|
||||||
fts_query = " OR ".join(tokens)
|
fts_query = " OR ".join(tokens)
|
||||||
|
filter_sql, filter_params = _metadata_filter_sql("embedding_fulltext_search.rowid")
|
||||||
try:
|
try:
|
||||||
rows = conn.execute(
|
rows = conn.execute(
|
||||||
"""
|
f"""
|
||||||
SELECT rowid
|
SELECT embedding_fulltext_search.rowid
|
||||||
FROM embedding_fulltext_search
|
FROM embedding_fulltext_search
|
||||||
|
JOIN embeddings e ON e.id = embedding_fulltext_search.rowid
|
||||||
|
JOIN segments s ON e.segment_id = s.id
|
||||||
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE embedding_fulltext_search MATCH ?
|
WHERE embedding_fulltext_search MATCH ?
|
||||||
|
AND c.name = ?
|
||||||
|
{filter_sql}
|
||||||
LIMIT ?
|
LIMIT ?
|
||||||
""",
|
""",
|
||||||
(fts_query, max_candidates),
|
(fts_query, collection_name, *filter_params, max_candidates),
|
||||||
).fetchall()
|
).fetchall()
|
||||||
candidate_ids = [r[0] for r in rows]
|
candidate_ids = [r[0] for r in rows]
|
||||||
except sqlite3.Error:
|
except sqlite3.Error:
|
||||||
# FTS5 tokenizer mismatch or syntax error — fall through
|
# FTS5 tokenizer mismatch or syntax error — fall through
|
||||||
# to the recency-window selector below.
|
# to the recency-window selector below.
|
||||||
logger.debug("FTS5 MATCH failed; using recency fallback", exc_info=True)
|
logger.debug("FTS5 MATCH failed; using recency fallback", exc_info=True)
|
||||||
|
use_recency_fallback = True
|
||||||
|
|
||||||
if not candidate_ids:
|
if not candidate_ids and use_recency_fallback:
|
||||||
# No FTS hits (or no usable tokens) — pull the most recent
|
# No usable FTS tokens, or FTS itself failed — pull the most
|
||||||
# rows for the drawers segment so we can BM25-rank something
|
# recent rows for the drawers segment so we can BM25-rank
|
||||||
# rather than return empty-handed. Wrapped in try/except
|
# something rather than return empty-handed. A clean FTS miss
|
||||||
# because the schema may differ on legacy palaces (older
|
# must stay empty, especially after wing/room filtering, because
|
||||||
# chromadb without ``created_at``, missing ``segments``
|
# recency fallback would return unrelated scoped drawers.
|
||||||
# rows after partial restore, etc.); on schema mismatch we
|
# Wrapped in try/except because the schema may differ on legacy
|
||||||
# fall back to ordering by primary-key id and finally to an
|
# palaces (older chromadb without ``created_at``, missing
|
||||||
# empty result rather than letting search raise.
|
# ``segments`` rows after partial restore, etc.); on schema
|
||||||
|
# mismatch we fall back to ordering by primary-key id and finally
|
||||||
|
# to an empty result rather than letting search raise.
|
||||||
try:
|
try:
|
||||||
|
filter_sql, filter_params = _metadata_filter_sql("e.id")
|
||||||
rows = conn.execute(
|
rows = conn.execute(
|
||||||
"""
|
f"""
|
||||||
SELECT e.id
|
SELECT e.id
|
||||||
FROM embeddings e
|
FROM embeddings e
|
||||||
JOIN segments s ON e.segment_id = s.id
|
JOIN segments s ON e.segment_id = s.id
|
||||||
JOIN collections c ON s.collection = c.id
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE c.name = 'mempalace_drawers'
|
WHERE c.name = ?
|
||||||
|
{filter_sql}
|
||||||
ORDER BY e.created_at DESC
|
ORDER BY e.created_at DESC
|
||||||
LIMIT ?
|
LIMIT ?
|
||||||
""",
|
""",
|
||||||
(max_candidates,),
|
(collection_name, *filter_params, max_candidates),
|
||||||
).fetchall()
|
).fetchall()
|
||||||
candidate_ids = [r[0] for r in rows]
|
candidate_ids = [r[0] for r in rows]
|
||||||
except sqlite3.Error:
|
except sqlite3.Error:
|
||||||
@@ -463,17 +505,19 @@ def _bm25_only_via_sqlite(
|
|||||||
exc_info=True,
|
exc_info=True,
|
||||||
)
|
)
|
||||||
try:
|
try:
|
||||||
|
filter_sql, filter_params = _metadata_filter_sql("e.id")
|
||||||
rows = conn.execute(
|
rows = conn.execute(
|
||||||
"""
|
f"""
|
||||||
SELECT e.id
|
SELECT e.id
|
||||||
FROM embeddings e
|
FROM embeddings e
|
||||||
JOIN segments s ON e.segment_id = s.id
|
JOIN segments s ON e.segment_id = s.id
|
||||||
JOIN collections c ON s.collection = c.id
|
JOIN collections c ON s.collection = c.id
|
||||||
WHERE c.name = 'mempalace_drawers'
|
WHERE c.name = ?
|
||||||
|
{filter_sql}
|
||||||
ORDER BY e.id DESC
|
ORDER BY e.id DESC
|
||||||
LIMIT ?
|
LIMIT ?
|
||||||
""",
|
""",
|
||||||
(max_candidates,),
|
(collection_name, *filter_params, max_candidates),
|
||||||
).fetchall()
|
).fetchall()
|
||||||
candidate_ids = [r[0] for r in rows]
|
candidate_ids = [r[0] for r in rows]
|
||||||
except sqlite3.Error:
|
except sqlite3.Error:
|
||||||
@@ -689,6 +733,7 @@ def search_memories(
|
|||||||
max_distance: float = 0.0,
|
max_distance: float = 0.0,
|
||||||
vector_disabled: bool = False,
|
vector_disabled: bool = False,
|
||||||
candidate_strategy: str = "vector",
|
candidate_strategy: str = "vector",
|
||||||
|
collection_name: str = None,
|
||||||
) -> dict:
|
) -> dict:
|
||||||
"""Programmatic search — returns a dict instead of printing.
|
"""Programmatic search — returns a dict instead of printing.
|
||||||
|
|
||||||
@@ -739,10 +784,11 @@ def search_memories(
|
|||||||
wing=wing,
|
wing=wing,
|
||||||
room=room,
|
room=room,
|
||||||
n_results=n_results,
|
n_results=n_results,
|
||||||
|
collection_name=collection_name,
|
||||||
)
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
drawers_col = get_collection(palace_path, create=False)
|
drawers_col = get_collection(palace_path, collection_name=collection_name, create=False)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error("No palace found at %s: %s", palace_path, e)
|
logger.error("No palace found at %s: %s", palace_path, e)
|
||||||
return {
|
return {
|
||||||
@@ -795,7 +841,8 @@ def search_memories(
|
|||||||
if source and source not in closet_boost_by_source:
|
if source and source not in closet_boost_by_source:
|
||||||
closet_boost_by_source[source] = (rank, cdist, cdoc[:200])
|
closet_boost_by_source[source] = (rank, cdist, cdoc[:200])
|
||||||
except Exception:
|
except Exception:
|
||||||
pass # no closets yet — hybrid degrades to pure drawer search
|
# No closets yet — hybrid degrades to pure drawer search.
|
||||||
|
logger.debug("Closet collection unavailable; using drawer-only search", exc_info=True)
|
||||||
|
|
||||||
# Rank-based boost. The ordinal signal ("which closet matched best") is
|
# Rank-based boost. The ordinal signal ("which closet matched best") is
|
||||||
# more reliable than absolute distance on narrative content, where
|
# more reliable than absolute distance on narrative content, where
|
||||||
@@ -809,6 +856,8 @@ def search_memories(
|
|||||||
_first_or_empty(drawer_results, "metadatas"),
|
_first_or_empty(drawer_results, "metadatas"),
|
||||||
_first_or_empty(drawer_results, "distances"),
|
_first_or_empty(drawer_results, "distances"),
|
||||||
):
|
):
|
||||||
|
meta = meta or {}
|
||||||
|
doc = doc or ""
|
||||||
# Filter on raw distance before rounding to avoid precision loss.
|
# Filter on raw distance before rounding to avoid precision loss.
|
||||||
if max_distance > 0.0 and dist > max_distance:
|
if max_distance > 0.0 and dist > max_distance:
|
||||||
continue
|
continue
|
||||||
@@ -825,7 +874,12 @@ def search_memories(
|
|||||||
matched_via = "drawer+closet"
|
matched_via = "drawer+closet"
|
||||||
closet_preview = c_preview
|
closet_preview = c_preview
|
||||||
|
|
||||||
effective_dist = dist - boost
|
# Clamp to the valid cosine-distance range [0, 2]. When a strong
|
||||||
|
# closet boost (up to 0.40) exceeds the raw distance, the subtraction
|
||||||
|
# can go negative — which (a) yields ``similarity > 1.0`` downstream
|
||||||
|
# and (b) makes the sort key land *below* ordinary positive distances,
|
||||||
|
# inverting the ranking so the best hybrid matches sort last.
|
||||||
|
effective_dist = max(0.0, min(2.0, dist - boost))
|
||||||
entry = {
|
entry = {
|
||||||
"text": doc,
|
"text": doc,
|
||||||
"wing": meta.get("wing", "unknown"),
|
"wing": meta.get("wing", "unknown"),
|
||||||
@@ -870,6 +924,7 @@ def search_memories(
|
|||||||
include=["documents", "metadatas"],
|
include=["documents", "metadatas"],
|
||||||
)
|
)
|
||||||
except Exception:
|
except Exception:
|
||||||
|
logger.debug("Neighbor fetch failed for %s", full_source, exc_info=True)
|
||||||
continue
|
continue
|
||||||
docs = source_drawers.documents
|
docs = source_drawers.documents
|
||||||
metas_ = source_drawers.metadatas
|
metas_ = source_drawers.metadatas
|
||||||
|
|||||||
@@ -40,8 +40,9 @@ def _patch_mcp_config(monkeypatch, palace_path, tmp_path):
|
|||||||
|
|
||||||
import mempalace.mcp_server as mcp_mod
|
import mempalace.mcp_server as mcp_mod
|
||||||
|
|
||||||
|
kg = KnowledgeGraph(db_path=str(tmp_path / "kg.sqlite3"))
|
||||||
monkeypatch.setattr(mcp_mod, "_config", cfg)
|
monkeypatch.setattr(mcp_mod, "_config", cfg)
|
||||||
monkeypatch.setattr(mcp_mod, "_kg", KnowledgeGraph(db_path=str(tmp_path / "kg.sqlite3")))
|
monkeypatch.setattr(mcp_mod, "_get_kg", lambda: kg)
|
||||||
|
|
||||||
|
|
||||||
def _get_rss_mb():
|
def _get_rss_mb():
|
||||||
|
|||||||
@@ -84,8 +84,9 @@ class TestToolStatusMemoryProfile:
|
|||||||
|
|
||||||
cfg = MempalaceConfig(config_dir=str(tmp_path / "cfg"))
|
cfg = MempalaceConfig(config_dir=str(tmp_path / "cfg"))
|
||||||
monkeypatch.setattr(cfg, "_file_config", {"palace_path": palace_path})
|
monkeypatch.setattr(cfg, "_file_config", {"palace_path": palace_path})
|
||||||
|
kg = KnowledgeGraph(db_path=str(tmp_path / "kg.sqlite3"))
|
||||||
monkeypatch.setattr(mcp_mod, "_config", cfg)
|
monkeypatch.setattr(mcp_mod, "_config", cfg)
|
||||||
monkeypatch.setattr(mcp_mod, "_kg", KnowledgeGraph(db_path=str(tmp_path / "kg.sqlite3")))
|
monkeypatch.setattr(mcp_mod, "_get_kg", lambda: kg)
|
||||||
|
|
||||||
from mempalace.mcp_server import tool_status
|
from mempalace.mcp_server import tool_status
|
||||||
|
|
||||||
|
|||||||
+346
-1
@@ -1,4 +1,6 @@
|
|||||||
import os
|
import os
|
||||||
|
import pickle
|
||||||
|
import shutil
|
||||||
import sqlite3
|
import sqlite3
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
@@ -18,6 +20,7 @@ from mempalace.backends.chroma import (
|
|||||||
ChromaCollection,
|
ChromaCollection,
|
||||||
_fix_blob_seq_ids,
|
_fix_blob_seq_ids,
|
||||||
_pin_hnsw_threads,
|
_pin_hnsw_threads,
|
||||||
|
quarantine_invalid_hnsw_metadata,
|
||||||
quarantine_stale_hnsw,
|
quarantine_stale_hnsw,
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -206,6 +209,52 @@ def test_query_empty_preserves_embeddings_outer_shape_when_requested():
|
|||||||
assert not_requested.embeddings is None
|
assert not_requested.embeddings is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_chroma_close_palace_releases_sqlite_lock_for_reopen(tmp_path):
|
||||||
|
"""close_palace must release chromadb's rust-side SQLite file lock so
|
||||||
|
a fresh PersistentClient on the same path after shutil.rmtree can
|
||||||
|
write without hitting SQLITE_READONLY_DBMOVED."""
|
||||||
|
backend = ChromaBackend()
|
||||||
|
palace_path = tmp_path / "palace-a"
|
||||||
|
ref = PalaceRef(id=str(palace_path), local_path=str(palace_path))
|
||||||
|
|
||||||
|
col = backend.get_collection(palace=ref, collection_name="mempalace_drawers", create=True)
|
||||||
|
col.upsert(documents=["hello"], ids=["a"], metadatas=[{"k": "v"}])
|
||||||
|
|
||||||
|
backend.close_palace(ref)
|
||||||
|
shutil.rmtree(palace_path)
|
||||||
|
|
||||||
|
col = backend.get_collection(palace=ref, collection_name="mempalace_drawers", create=True)
|
||||||
|
col.upsert(documents=["world"], ids=["b"], metadatas=[{"k": "v2"}])
|
||||||
|
assert col.count() == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_chroma_close_releases_all_cached_clients(tmp_path):
|
||||||
|
"""close() must release every cached client's SQLite file lock so any
|
||||||
|
of their palace paths can be reopened by a fresh backend in the same
|
||||||
|
process."""
|
||||||
|
backend = ChromaBackend()
|
||||||
|
palace_a = tmp_path / "palace-a"
|
||||||
|
palace_b = tmp_path / "palace-b"
|
||||||
|
ref_a = PalaceRef(id=str(palace_a), local_path=str(palace_a))
|
||||||
|
ref_b = PalaceRef(id=str(palace_b), local_path=str(palace_b))
|
||||||
|
|
||||||
|
for ref in (ref_a, ref_b):
|
||||||
|
backend.get_collection(palace=ref, collection_name="mempalace_drawers", create=True).upsert(
|
||||||
|
documents=["x"], ids=["x"], metadatas=[{"k": "v"}]
|
||||||
|
)
|
||||||
|
|
||||||
|
backend.close()
|
||||||
|
|
||||||
|
for path in (palace_a, palace_b):
|
||||||
|
shutil.rmtree(path)
|
||||||
|
ref = PalaceRef(id=str(path), local_path=str(path))
|
||||||
|
fresh = ChromaBackend()
|
||||||
|
col = fresh.get_collection(palace=ref, collection_name="mempalace_drawers", create=True)
|
||||||
|
col.upsert(documents=["y"], ids=["y"], metadatas=[{"k": "v2"}])
|
||||||
|
assert col.count() == 1
|
||||||
|
fresh.close()
|
||||||
|
|
||||||
|
|
||||||
def test_chroma_cache_invalidates_when_db_file_missing(tmp_path):
|
def test_chroma_cache_invalidates_when_db_file_missing(tmp_path):
|
||||||
"""A palace rebuild that removes chroma.sqlite3 must drop the stale cache.
|
"""A palace rebuild that removes chroma.sqlite3 must drop the stale cache.
|
||||||
|
|
||||||
@@ -708,7 +757,10 @@ def test_make_client_quarantines_only_on_first_call_per_palace(tmp_path, monkeyp
|
|||||||
"""Quarantine fires on first ``make_client()`` for a palace, then is
|
"""Quarantine fires on first ``make_client()`` for a palace, then is
|
||||||
skipped on subsequent calls — prevents runtime thrash where a daemon's
|
skipped on subsequent calls — prevents runtime thrash where a daemon's
|
||||||
own steady writes bump ``chroma.sqlite3`` faster than HNSW flushes,
|
own steady writes bump ``chroma.sqlite3`` faster than HNSW flushes,
|
||||||
making the mtime heuristic falsely trigger every reconnect."""
|
making the mtime heuristic falsely trigger every reconnect.
|
||||||
|
|
||||||
|
Invalid metadata quarantine shares the same cold-start gate here; the
|
||||||
|
more aggressive refresh path lives in ``_client()``."""
|
||||||
from mempalace.backends.chroma import ChromaBackend
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
palace_path = str(tmp_path / "palace")
|
palace_path = str(tmp_path / "palace")
|
||||||
@@ -735,6 +787,34 @@ def test_make_client_quarantines_only_on_first_call_per_palace(tmp_path, monkeyp
|
|||||||
], "quarantine_stale_hnsw should fire once per palace per process, not on every reconnect"
|
], "quarantine_stale_hnsw should fire once per palace per process, not on every reconnect"
|
||||||
|
|
||||||
|
|
||||||
|
def test_make_client_gates_invalid_metadata_on_first_call(tmp_path, monkeypatch):
|
||||||
|
"""Invalid metadata quarantine is gated on the first make_client() call."""
|
||||||
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
palace_path = str(tmp_path / "palace")
|
||||||
|
os.makedirs(palace_path, exist_ok=True)
|
||||||
|
(Path(palace_path) / "chroma.sqlite3").write_text("")
|
||||||
|
|
||||||
|
monkeypatch.setattr(ChromaBackend, "_quarantined_paths", set())
|
||||||
|
|
||||||
|
calls: list[str] = []
|
||||||
|
|
||||||
|
def _invalid(path, *args, **kwargs):
|
||||||
|
calls.append(path)
|
||||||
|
return []
|
||||||
|
|
||||||
|
def _stale(path, stale_seconds=300.0):
|
||||||
|
return []
|
||||||
|
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma.quarantine_invalid_hnsw_metadata", _invalid)
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma.quarantine_stale_hnsw", _stale)
|
||||||
|
|
||||||
|
ChromaBackend.make_client(palace_path)
|
||||||
|
ChromaBackend.make_client(palace_path)
|
||||||
|
|
||||||
|
assert calls == [palace_path]
|
||||||
|
|
||||||
|
|
||||||
def test_make_client_quarantines_each_palace_independently(tmp_path, monkeypatch):
|
def test_make_client_quarantines_each_palace_independently(tmp_path, monkeypatch):
|
||||||
"""Two distinct palaces each get one quarantine attempt — the gate is
|
"""Two distinct palaces each get one quarantine attempt — the gate is
|
||||||
keyed by palace path, not global."""
|
keyed by palace path, not global."""
|
||||||
@@ -872,3 +952,268 @@ def test_get_collection_applies_retrofit_on_existing_palace(tmp_path):
|
|||||||
)
|
)
|
||||||
|
|
||||||
assert wrapper._collection.configuration_json["hnsw"]["num_threads"] == 1
|
assert wrapper._collection.configuration_json["hnsw"]["num_threads"] == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_renames_missing_dimensionality(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
with open(seg / "index_metadata.pickle", "wb") as f:
|
||||||
|
pickle.dump({"dimensionality": None, "id_to_label": {"a": 1}}, f)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert len(moved) == 1
|
||||||
|
assert ".corrupt-" in moved[0]
|
||||||
|
assert not seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_allows_uninitialized_segment(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
with open(seg / "index_metadata.pickle", "wb") as f:
|
||||||
|
pickle.dump({"dimensionality": None, "id_to_label": {}}, f)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert moved == []
|
||||||
|
assert seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_rejects_non_dict_id_to_label(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
with open(seg / "index_metadata.pickle", "wb") as f:
|
||||||
|
pickle.dump({"dimensionality": 8, "id_to_label": ["a", "b"]}, f)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert len(moved) == 1
|
||||||
|
assert ".corrupt-" in moved[0]
|
||||||
|
assert not seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_rejects_non_schema_payload(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
with open(seg / "index_metadata.pickle", "wb") as f:
|
||||||
|
pickle.dump(["not", "a", "metadata", "object"], f)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert len(moved) == 1
|
||||||
|
assert ".corrupt-" in moved[0]
|
||||||
|
assert not seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def _dangerous_pickle_payload_executed():
|
||||||
|
raise AssertionError("unsafe pickle payload executed")
|
||||||
|
|
||||||
|
|
||||||
|
class _DangerousPickle:
|
||||||
|
def __reduce__(self):
|
||||||
|
return (_dangerous_pickle_payload_executed, ())
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_rejects_unsafe_pickle(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
with open(seg / "index_metadata.pickle", "wb") as f:
|
||||||
|
pickle.dump(_DangerousPickle(), f)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert len(moved) == 1
|
||||||
|
assert ".corrupt-" in moved[0]
|
||||||
|
assert not seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_skips_transient_read_errors(tmp_path, monkeypatch):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
meta = seg / "index_metadata.pickle"
|
||||||
|
meta.write_bytes(b"partial")
|
||||||
|
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma._SafePersistentDataUnpickler.load",
|
||||||
|
lambda path: (_ for _ in ()).throw(EOFError("flush in progress")),
|
||||||
|
)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert moved == []
|
||||||
|
assert seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_invalid_hnsw_metadata_skips_truncated_pickle(tmp_path, monkeypatch):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
seg = palace / "abcd-1234-5678"
|
||||||
|
seg.mkdir()
|
||||||
|
meta = seg / "index_metadata.pickle"
|
||||||
|
meta.write_bytes(b"partial")
|
||||||
|
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma._SafePersistentDataUnpickler.load",
|
||||||
|
lambda path: (_ for _ in ()).throw(pickle.UnpicklingError("pickle data was truncated")),
|
||||||
|
)
|
||||||
|
|
||||||
|
moved = quarantine_invalid_hnsw_metadata(str(palace))
|
||||||
|
|
||||||
|
assert moved == []
|
||||||
|
assert seg.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_chroma_backend_preflights_metadata_before_persistent_client(tmp_path, monkeypatch):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
calls = []
|
||||||
|
|
||||||
|
def _record(name):
|
||||||
|
def inner(path, *args, **kwargs):
|
||||||
|
calls.append((name, path))
|
||||||
|
return [] if name != "blob" else None
|
||||||
|
|
||||||
|
return inner
|
||||||
|
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma._fix_blob_seq_ids", _record("blob"))
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.quarantine_invalid_hnsw_metadata", _record("invalid")
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma.quarantine_stale_hnsw", _record("stale"))
|
||||||
|
|
||||||
|
class DummyClient:
|
||||||
|
pass
|
||||||
|
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.chromadb.PersistentClient", lambda path: DummyClient()
|
||||||
|
)
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
backend._client(str(palace))
|
||||||
|
|
||||||
|
assert calls == [
|
||||||
|
("blob", str(palace)),
|
||||||
|
("invalid", str(palace)),
|
||||||
|
("stale", str(palace)),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_chroma_backend_stale_quarantine_is_cold_start_only_on_refresh(tmp_path, monkeypatch):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
(palace / "chroma.sqlite3").write_text("")
|
||||||
|
calls = []
|
||||||
|
|
||||||
|
def _record(name):
|
||||||
|
def inner(path, *args, **kwargs):
|
||||||
|
calls.append((name, path))
|
||||||
|
return [] if name != "blob" else None
|
||||||
|
|
||||||
|
return inner
|
||||||
|
|
||||||
|
monkeypatch.setattr(ChromaBackend, "_quarantined_paths", set())
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma._fix_blob_seq_ids", _record("blob"))
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.quarantine_invalid_hnsw_metadata", _record("invalid")
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma.quarantine_stale_hnsw", _record("stale"))
|
||||||
|
|
||||||
|
class DummyClient:
|
||||||
|
pass
|
||||||
|
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.chromadb.PersistentClient", lambda path: DummyClient()
|
||||||
|
)
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
stats = iter([(1, 1.0), (1, 1.0), (1, 2.0), (1, 2.0)])
|
||||||
|
monkeypatch.setattr(backend, "_db_stat", lambda path: next(stats))
|
||||||
|
|
||||||
|
backend._client(str(palace))
|
||||||
|
backend._client(str(palace))
|
||||||
|
|
||||||
|
assert calls == [
|
||||||
|
("blob", str(palace)),
|
||||||
|
("invalid", str(palace)),
|
||||||
|
("stale", str(palace)),
|
||||||
|
("blob", str(palace)),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_chroma_backend_requarantines_after_inode_replacement(tmp_path, monkeypatch):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
(palace / "chroma.sqlite3").write_text("")
|
||||||
|
calls = []
|
||||||
|
|
||||||
|
def _record(name):
|
||||||
|
def inner(path, *args, **kwargs):
|
||||||
|
calls.append((name, path))
|
||||||
|
return [] if name != "blob" else None
|
||||||
|
|
||||||
|
return inner
|
||||||
|
|
||||||
|
monkeypatch.setattr(ChromaBackend, "_quarantined_paths", set())
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma._fix_blob_seq_ids", _record("blob"))
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.quarantine_invalid_hnsw_metadata", _record("invalid")
|
||||||
|
)
|
||||||
|
monkeypatch.setattr("mempalace.backends.chroma.quarantine_stale_hnsw", _record("stale"))
|
||||||
|
|
||||||
|
class DummyClient:
|
||||||
|
pass
|
||||||
|
|
||||||
|
monkeypatch.setattr(
|
||||||
|
"mempalace.backends.chroma.chromadb.PersistentClient", lambda path: DummyClient()
|
||||||
|
)
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
stats = iter([(1, 1.0), (1, 1.0), (2, 2.0), (2, 2.0)])
|
||||||
|
monkeypatch.setattr(backend, "_db_stat", lambda path: next(stats))
|
||||||
|
|
||||||
|
backend._client(str(palace))
|
||||||
|
backend._client(str(palace))
|
||||||
|
|
||||||
|
assert calls == [
|
||||||
|
("blob", str(palace)),
|
||||||
|
("invalid", str(palace)),
|
||||||
|
("stale", str(palace)),
|
||||||
|
("blob", str(palace)),
|
||||||
|
("invalid", str(palace)),
|
||||||
|
("stale", str(palace)),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_palace_get_collection_uses_configured_collection_name(monkeypatch):
|
||||||
|
from mempalace import palace
|
||||||
|
|
||||||
|
captured = {}
|
||||||
|
|
||||||
|
def fake_get_collection(palace_path, collection_name=None, create=False):
|
||||||
|
captured["palace_path"] = palace_path
|
||||||
|
captured["collection_name"] = collection_name
|
||||||
|
captured["create"] = create
|
||||||
|
return object()
|
||||||
|
|
||||||
|
monkeypatch.setattr(palace._DEFAULT_BACKEND, "get_collection", fake_get_collection)
|
||||||
|
monkeypatch.setattr("mempalace.config.get_configured_collection_name", lambda: "custom_drawers")
|
||||||
|
|
||||||
|
palace.get_collection("/palace", create=False)
|
||||||
|
|
||||||
|
assert captured == {
|
||||||
|
"palace_path": "/palace",
|
||||||
|
"collection_name": "custom_drawers",
|
||||||
|
"create": False,
|
||||||
|
}
|
||||||
|
|||||||
@@ -0,0 +1,321 @@
|
|||||||
|
"""Tests for ChromaCollection's palace-write-lock integration.
|
||||||
|
|
||||||
|
Closes the gap left by ``mine_palace_lock`` only protecting the
|
||||||
|
``mempalace mine`` pipeline: MCP/direct writers that call
|
||||||
|
``ChromaCollection.add/upsert/update/delete`` must also serialize against
|
||||||
|
mine and against each other to avoid the multi-threaded HNSW corruption
|
||||||
|
documented in #974/#965.
|
||||||
|
|
||||||
|
Property tested:
|
||||||
|
|
||||||
|
* ``ChromaCollection(c, palace_path=p)`` wraps every write with
|
||||||
|
``mine_palace_lock(p)``.
|
||||||
|
* Writes raise ``MineAlreadyRunning`` when another holder owns the lock
|
||||||
|
(instead of silently racing into the underlying chromadb call).
|
||||||
|
* Re-entrant composition with ``miner.mine()`` does not self-deadlock:
|
||||||
|
``with mine_palace_lock(p): col.upsert(...)`` runs to completion.
|
||||||
|
* ``ChromaCollection(c)`` (no palace_path) preserves legacy no-lock
|
||||||
|
behaviour for tests/callers that build the adapter directly without
|
||||||
|
going through ``ChromaBackend``.
|
||||||
|
|
||||||
|
POSIX-only: ``mine_palace_lock`` uses ``fcntl`` on Unix and ``msvcrt`` on
|
||||||
|
Windows; the contention semantics differ enough that the cross-process
|
||||||
|
tests are skipped on Windows runners.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import multiprocessing
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from mempalace.backends.chroma import ChromaCollection
|
||||||
|
from mempalace.palace import MineAlreadyRunning, mine_palace_lock
|
||||||
|
|
||||||
|
|
||||||
|
def _get_mp_context():
|
||||||
|
"""Same start-method picker as test_palace_locks.py."""
|
||||||
|
start_method = "spawn" if os.name == "nt" else "fork"
|
||||||
|
return multiprocessing.get_context(start_method)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Fakes
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class _FakeChromaCollection:
|
||||||
|
"""Records calls; never blocks. Stand-in for chromadb.Collection."""
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
self.adds: list[dict] = []
|
||||||
|
self.upserts: list[dict] = []
|
||||||
|
self.updates: list[dict] = []
|
||||||
|
self.deletes: list[dict] = []
|
||||||
|
|
||||||
|
def add(self, **kwargs):
|
||||||
|
self.adds.append(kwargs)
|
||||||
|
|
||||||
|
def upsert(self, **kwargs):
|
||||||
|
self.upserts.append(kwargs)
|
||||||
|
|
||||||
|
def update(self, **kwargs):
|
||||||
|
self.updates.append(kwargs)
|
||||||
|
|
||||||
|
def delete(self, **kwargs):
|
||||||
|
self.deletes.append(kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _hold_lock(palace_path: str, ready_flag: str, release_flag: str) -> int:
|
||||||
|
"""Acquire ``mine_palace_lock``, signal readiness, wait for release.
|
||||||
|
|
||||||
|
Mirrors the helper in ``test_palace_locks.py`` so the contention
|
||||||
|
semantics match across both test files.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
with mine_palace_lock(palace_path):
|
||||||
|
open(ready_flag, "w").close()
|
||||||
|
for _ in range(500):
|
||||||
|
if os.path.exists(release_flag):
|
||||||
|
return 0
|
||||||
|
time.sleep(0.01)
|
||||||
|
return 0
|
||||||
|
except MineAlreadyRunning:
|
||||||
|
return 1
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Tests — opt-in lock wiring
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_palace_path_none_skips_lock(tmp_path, monkeypatch):
|
||||||
|
"""Legacy callers (``ChromaCollection(c)``) keep no-lock behaviour.
|
||||||
|
|
||||||
|
A ``ChromaCollection`` built without ``palace_path`` must not touch the
|
||||||
|
lock infrastructure at all. This guards against regressions where a
|
||||||
|
test or third-party caller relies on the historical bare-write path.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
fake = _FakeChromaCollection()
|
||||||
|
col = ChromaCollection(fake) # no palace_path -> no lock
|
||||||
|
|
||||||
|
# Hold the lock in a child process. Without palace_path, the parent
|
||||||
|
# write must still succeed (the lock does not gate this caller).
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
ready = str(tmp_path / "ready")
|
||||||
|
release = str(tmp_path / "release")
|
||||||
|
ctx = _get_mp_context()
|
||||||
|
holder = ctx.Process(target=_hold_lock, args=(palace, ready, release))
|
||||||
|
holder.start()
|
||||||
|
try:
|
||||||
|
for _ in range(500):
|
||||||
|
if os.path.exists(ready):
|
||||||
|
break
|
||||||
|
time.sleep(0.01)
|
||||||
|
assert os.path.exists(ready), "holder failed to acquire lock"
|
||||||
|
|
||||||
|
col.upsert(documents=["doc"], ids=["id-1"])
|
||||||
|
assert fake.upserts == [{"documents": ["doc"], "ids": ["id-1"]}]
|
||||||
|
finally:
|
||||||
|
open(release, "w").close()
|
||||||
|
holder.join(timeout=5)
|
||||||
|
|
||||||
|
|
||||||
|
def test_writer_blocks_during_mine(tmp_path, monkeypatch):
|
||||||
|
"""A held ``mine_palace_lock`` causes ``ChromaCollection`` writes to raise.
|
||||||
|
|
||||||
|
This is the property that closes the MCP-bypass gap: when a mine is in
|
||||||
|
flight, MCP/direct writes raise ``MineAlreadyRunning`` rather than
|
||||||
|
silently entering chromadb's write path concurrent with mine.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
ready = str(tmp_path / "ready")
|
||||||
|
release = str(tmp_path / "release")
|
||||||
|
|
||||||
|
ctx = _get_mp_context()
|
||||||
|
holder = ctx.Process(target=_hold_lock, args=(palace, ready, release))
|
||||||
|
holder.start()
|
||||||
|
try:
|
||||||
|
for _ in range(500):
|
||||||
|
if os.path.exists(ready):
|
||||||
|
break
|
||||||
|
time.sleep(0.01)
|
||||||
|
assert os.path.exists(ready), "holder failed to acquire lock"
|
||||||
|
|
||||||
|
fake = _FakeChromaCollection()
|
||||||
|
col = ChromaCollection(fake, palace_path=palace)
|
||||||
|
|
||||||
|
with pytest.raises(MineAlreadyRunning):
|
||||||
|
col.upsert(documents=["doc"], ids=["id-1"])
|
||||||
|
with pytest.raises(MineAlreadyRunning):
|
||||||
|
col.add(documents=["doc"], ids=["id-2"])
|
||||||
|
with pytest.raises(MineAlreadyRunning):
|
||||||
|
col.update(ids=["id-3"], documents=["doc"])
|
||||||
|
with pytest.raises(MineAlreadyRunning):
|
||||||
|
col.delete(ids=["id-4"])
|
||||||
|
|
||||||
|
# The fake must have received NO calls — the lock must gate
|
||||||
|
# before reaching the underlying chromadb layer.
|
||||||
|
assert fake.upserts == []
|
||||||
|
assert fake.adds == []
|
||||||
|
assert fake.updates == []
|
||||||
|
assert fake.deletes == []
|
||||||
|
finally:
|
||||||
|
open(release, "w").close()
|
||||||
|
holder.join(timeout=5)
|
||||||
|
|
||||||
|
|
||||||
|
def test_reentrant_inside_mine_passes_through(tmp_path, monkeypatch):
|
||||||
|
"""``ChromaCollection.upsert`` inside ``mine_palace_lock`` does not deadlock.
|
||||||
|
|
||||||
|
``miner.mine()`` already holds ``mine_palace_lock(palace_path)`` for the
|
||||||
|
full mine pipeline; ``_mine_body`` then calls
|
||||||
|
``collection.upsert(...)``. With the per-thread re-entrant guard in
|
||||||
|
``mine_palace_lock``, the inner acquire is a pass-through and the
|
||||||
|
underlying chromadb call runs immediately.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
fake = _FakeChromaCollection()
|
||||||
|
col = ChromaCollection(fake, palace_path=palace)
|
||||||
|
|
||||||
|
with mine_palace_lock(palace):
|
||||||
|
# If the re-entrant guard were missing, this would self-deadlock on
|
||||||
|
# the underlying flock. We rely on pytest-timeout (configured in
|
||||||
|
# pyproject.toml) to enforce this in CI; the assertion just confirms
|
||||||
|
# the call landed.
|
||||||
|
col.upsert(documents=["d"], ids=["i"], metadatas=[{"k": "v"}])
|
||||||
|
col.add(documents=["d2"], ids=["i2"])
|
||||||
|
col.update(ids=["i"], documents=["d-updated"])
|
||||||
|
col.delete(ids=["i2"])
|
||||||
|
|
||||||
|
assert len(fake.upserts) == 1
|
||||||
|
assert len(fake.adds) == 1
|
||||||
|
assert len(fake.updates) == 1
|
||||||
|
assert len(fake.deletes) == 1
|
||||||
|
|
||||||
|
|
||||||
|
class _SlowFakeChromaCollection(_FakeChromaCollection):
|
||||||
|
"""Fake whose write methods hold the caller for ``hold_seconds``.
|
||||||
|
|
||||||
|
Used to keep ``mine_palace_lock`` acquired long enough for a sibling
|
||||||
|
process to contend deterministically.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, hold_seconds: float = 0.3):
|
||||||
|
super().__init__()
|
||||||
|
self._hold = hold_seconds
|
||||||
|
|
||||||
|
def upsert(self, **kwargs):
|
||||||
|
time.sleep(self._hold)
|
||||||
|
super().upsert(**kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
def _slow_writer_target(palace_path, tmp_path_str, pid, result_q):
|
||||||
|
"""Subprocess target: try a slow upsert, report ok/busy."""
|
||||||
|
os.environ["HOME"] = tmp_path_str
|
||||||
|
# Fresh import inside child so HOME monkeypatch routes the lock dir.
|
||||||
|
from mempalace.backends.chroma import ChromaCollection as _CC
|
||||||
|
from mempalace.palace import MineAlreadyRunning as _MAR
|
||||||
|
|
||||||
|
fake = _SlowFakeChromaCollection(hold_seconds=0.3)
|
||||||
|
col = _CC(fake, palace_path=palace_path)
|
||||||
|
try:
|
||||||
|
col.upsert(documents=[f"d{pid}"], ids=[f"i{pid}"])
|
||||||
|
result_q.put(("ok", pid))
|
||||||
|
except _MAR:
|
||||||
|
result_q.put(("busy", pid))
|
||||||
|
|
||||||
|
|
||||||
|
def test_concurrent_writers_serialize(tmp_path, monkeypatch):
|
||||||
|
"""Two processes calling ``ChromaCollection.upsert`` against the same
|
||||||
|
palace must be serialized: at most one enters chromadb at a time, the
|
||||||
|
other raises ``MineAlreadyRunning``.
|
||||||
|
|
||||||
|
This is the property that prevents the parallel HNSW insert race that
|
||||||
|
drives #974/#965 — under concurrent MCP write fan-out, exactly one
|
||||||
|
writer reaches chromadb and the rest fail loudly instead of corrupting
|
||||||
|
the index.
|
||||||
|
|
||||||
|
The slow fake holds the lock for 0.3s per writer, large enough for the
|
||||||
|
second process to contend even on slow CI runners.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
|
||||||
|
ctx = _get_mp_context()
|
||||||
|
result_q = ctx.Queue()
|
||||||
|
|
||||||
|
p1 = ctx.Process(target=_slow_writer_target, args=(palace, str(tmp_path), 1, result_q))
|
||||||
|
p2 = ctx.Process(target=_slow_writer_target, args=(palace, str(tmp_path), 2, result_q))
|
||||||
|
p1.start()
|
||||||
|
# Tiny stagger so p1 wins the race deterministically; without it the
|
||||||
|
# OS scheduler can pick either, which is also a valid outcome but
|
||||||
|
# makes the assertion brittle on slow CI.
|
||||||
|
time.sleep(0.05)
|
||||||
|
p2.start()
|
||||||
|
p1.join(timeout=5)
|
||||||
|
p2.join(timeout=5)
|
||||||
|
|
||||||
|
outcomes = [result_q.get(timeout=1) for _ in range(2)]
|
||||||
|
statuses = sorted(o[0] for o in outcomes)
|
||||||
|
assert statuses == ["busy", "ok"], f"expected one ok + one busy, got {outcomes}"
|
||||||
|
|
||||||
|
|
||||||
|
def test_read_path_does_not_acquire_lock(tmp_path, monkeypatch):
|
||||||
|
"""``query`` / ``get`` / ``count`` must not be gated by the write lock.
|
||||||
|
|
||||||
|
Read traffic is the dominant workload (semantic search, MCP get, etc.)
|
||||||
|
and serializing it against mine would tank latency for no correctness
|
||||||
|
benefit. This test pins that property: with another process holding
|
||||||
|
the write lock, reads must still complete instantly.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
ready = str(tmp_path / "ready")
|
||||||
|
release = str(tmp_path / "release")
|
||||||
|
|
||||||
|
ctx = _get_mp_context()
|
||||||
|
holder = ctx.Process(target=_hold_lock, args=(palace, ready, release))
|
||||||
|
holder.start()
|
||||||
|
try:
|
||||||
|
for _ in range(500):
|
||||||
|
if os.path.exists(ready):
|
||||||
|
break
|
||||||
|
time.sleep(0.01)
|
||||||
|
assert os.path.exists(ready), "holder failed to acquire lock"
|
||||||
|
|
||||||
|
# _FakeChromaCollection doesn't implement query/get/count; we only
|
||||||
|
# need to confirm the wrapper does not call into mine_palace_lock
|
||||||
|
# for reads, which we assert by observing the wrapped methods are
|
||||||
|
# NOT in ChromaCollection's _write_lock path. A direct check via
|
||||||
|
# source inspection is more honest than mocking the entire chroma
|
||||||
|
# surface here.
|
||||||
|
import inspect
|
||||||
|
|
||||||
|
from mempalace.backends.chroma import ChromaCollection as _CC
|
||||||
|
|
||||||
|
for write_attr in ("add", "upsert", "update", "delete"):
|
||||||
|
src = inspect.getsource(getattr(_CC, write_attr))
|
||||||
|
assert "_write_lock" in src, f"{write_attr} should acquire write lock"
|
||||||
|
|
||||||
|
for read_attr in ("query", "get", "count"):
|
||||||
|
method = getattr(_CC, read_attr, None)
|
||||||
|
if method is None:
|
||||||
|
continue
|
||||||
|
src = inspect.getsource(method)
|
||||||
|
assert (
|
||||||
|
"_write_lock" not in src
|
||||||
|
), f"{read_attr} must NOT acquire the write lock (read path)"
|
||||||
|
finally:
|
||||||
|
open(release, "w").close()
|
||||||
|
holder.join(timeout=5)
|
||||||
+206
-1
@@ -4,7 +4,7 @@ import argparse
|
|||||||
import shlex
|
import shlex
|
||||||
import sys
|
import sys
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from unittest.mock import MagicMock, patch
|
from unittest.mock import MagicMock, call, patch
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
@@ -776,6 +776,7 @@ def test_cmd_repair_error_reading(mock_config_cls, tmp_path, capsys):
|
|||||||
palace_dir.mkdir()
|
palace_dir.mkdir()
|
||||||
(palace_dir / "chroma.sqlite3").write_text("db")
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
mock_config_cls.return_value.palace_path = str(palace_dir)
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "mempalace_drawers"
|
||||||
args = argparse.Namespace(palace=None)
|
args = argparse.Namespace(palace=None)
|
||||||
mock_backend = MagicMock()
|
mock_backend = MagicMock()
|
||||||
mock_backend.get_collection.side_effect = Exception("corrupt db")
|
mock_backend.get_collection.side_effect = Exception("corrupt db")
|
||||||
@@ -791,6 +792,7 @@ def test_cmd_repair_zero_drawers(mock_config_cls, tmp_path, capsys):
|
|||||||
palace_dir.mkdir()
|
palace_dir.mkdir()
|
||||||
(palace_dir / "chroma.sqlite3").write_text("db")
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
mock_config_cls.return_value.palace_path = str(palace_dir)
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "mempalace_drawers"
|
||||||
args = argparse.Namespace(palace=None)
|
args = argparse.Namespace(palace=None)
|
||||||
mock_col = MagicMock()
|
mock_col = MagicMock()
|
||||||
mock_col.count.return_value = 0
|
mock_col.count.return_value = 0
|
||||||
@@ -807,6 +809,7 @@ def test_cmd_repair_success(mock_config_cls, tmp_path, capsys):
|
|||||||
palace_dir.mkdir()
|
palace_dir.mkdir()
|
||||||
(palace_dir / "chroma.sqlite3").write_text("db")
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
mock_config_cls.return_value.palace_path = str(palace_dir)
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "mempalace_drawers"
|
||||||
args = argparse.Namespace(palace=None, yes=True)
|
args = argparse.Namespace(palace=None, yes=True)
|
||||||
mock_col = MagicMock()
|
mock_col = MagicMock()
|
||||||
mock_col.count.return_value = 2
|
mock_col.count.return_value = 2
|
||||||
@@ -815,13 +818,98 @@ def test_cmd_repair_success(mock_config_cls, tmp_path, capsys):
|
|||||||
"documents": ["doc1", "doc2"],
|
"documents": ["doc1", "doc2"],
|
||||||
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
}
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
mock_new_col = MagicMock()
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
mock_backend = _mock_backend_for(col=mock_col, new_col=mock_new_col)
|
mock_backend = _mock_backend_for(col=mock_col, new_col=mock_new_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
with patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend):
|
with patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend):
|
||||||
cmd_repair(args)
|
cmd_repair(args)
|
||||||
out = capsys.readouterr().out
|
out = capsys.readouterr().out
|
||||||
assert "Repair complete" in out
|
assert "Repair complete" in out
|
||||||
assert "2 drawers rebuilt" in out
|
assert "2 drawers rebuilt" in out
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(palace_dir), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(palace_dir), "mempalace_drawers"),
|
||||||
|
call(str(palace_dir), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
mock_temp_col.upsert.assert_called_once()
|
||||||
|
mock_new_col.upsert.assert_called_once()
|
||||||
|
mock_new_col.add.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.cli.MempalaceConfig")
|
||||||
|
def test_cmd_repair_uses_configured_collection(mock_config_cls, tmp_path, capsys):
|
||||||
|
palace_dir = tmp_path / "palace"
|
||||||
|
palace_dir.mkdir()
|
||||||
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "custom_drawers"
|
||||||
|
args = argparse.Namespace(palace=None, yes=True)
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
|
mock_backend = _mock_backend_for(col=mock_col, new_col=mock_new_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
|
||||||
|
with patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend):
|
||||||
|
cmd_repair(args)
|
||||||
|
|
||||||
|
out = capsys.readouterr().out
|
||||||
|
assert "Repair complete" in out
|
||||||
|
mock_backend.get_collection.assert_called_once_with(str(palace_dir), "custom_drawers")
|
||||||
|
assert mock_backend.create_collection.call_args_list == [
|
||||||
|
call(str(palace_dir), "custom_drawers__repair_tmp"),
|
||||||
|
call(str(palace_dir), "custom_drawers"),
|
||||||
|
]
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(palace_dir), "custom_drawers__repair_tmp"),
|
||||||
|
call(str(palace_dir), "custom_drawers"),
|
||||||
|
call(str(palace_dir), "custom_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.cli.MempalaceConfig")
|
||||||
|
def test_cmd_repair_restores_backup_on_live_rebuild_failure(mock_config_cls, tmp_path, capsys):
|
||||||
|
palace_dir = tmp_path / "palace"
|
||||||
|
palace_dir.mkdir()
|
||||||
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "mempalace_drawers"
|
||||||
|
args = argparse.Namespace(palace=None, yes=True)
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_backend = _mock_backend_for(col=mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, RuntimeError("live build failed")]
|
||||||
|
with patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend):
|
||||||
|
with pytest.raises(SystemExit) as excinfo:
|
||||||
|
cmd_repair(args)
|
||||||
|
out = capsys.readouterr().out
|
||||||
|
assert excinfo.value.code == 1
|
||||||
|
assert "Repair failed" in out
|
||||||
|
assert "restoring from backup" in out
|
||||||
|
mock_backend.close_palace.assert_called_once_with(str(palace_dir))
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(palace_dir), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(palace_dir), "mempalace_drawers"),
|
||||||
|
call(str(palace_dir), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
@patch("mempalace.cli.MempalaceConfig")
|
@patch("mempalace.cli.MempalaceConfig")
|
||||||
@@ -830,6 +918,7 @@ def test_cmd_repair_aborts_without_confirmation(mock_config_cls, tmp_path, capsy
|
|||||||
palace_dir.mkdir()
|
palace_dir.mkdir()
|
||||||
(palace_dir / "chroma.sqlite3").write_text("db")
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
mock_config_cls.return_value.palace_path = str(palace_dir)
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
mock_config_cls.return_value.collection_name = "mempalace_drawers"
|
||||||
args = argparse.Namespace(palace=None)
|
args = argparse.Namespace(palace=None)
|
||||||
mock_col = MagicMock()
|
mock_col = MagicMock()
|
||||||
mock_col.count.return_value = 1
|
mock_col.count.return_value = 1
|
||||||
@@ -1042,3 +1131,119 @@ def test_cmd_repair_trailing_slash_does_not_recurse():
|
|||||||
palace_path = os.path.expanduser(args.palace).rstrip(os.sep)
|
palace_path = os.path.expanduser(args.palace).rstrip(os.sep)
|
||||||
backup_path = palace_path + ".backup"
|
backup_path = palace_path + ".backup"
|
||||||
assert not backup_path.startswith(palace_path + os.sep)
|
assert not backup_path.startswith(palace_path + os.sep)
|
||||||
|
|
||||||
|
|
||||||
|
# ── stdio reconfigure on Windows ─────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
class _ReconfigurableStringIO:
|
||||||
|
def __init__(self):
|
||||||
|
self.reconfigure_calls = []
|
||||||
|
|
||||||
|
def reconfigure(self, **kwargs):
|
||||||
|
self.reconfigure_calls.append(kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
def test_reconfigures_stdio_to_utf8_on_windows():
|
||||||
|
"""Windows `mempalace` CLI must decode/encode stdio as UTF-8.
|
||||||
|
|
||||||
|
Without this, piped non-ASCII input (`mempalace search ... < q.txt`)
|
||||||
|
or piped non-ASCII output (`mempalace search "..." > out.txt`) is
|
||||||
|
mojibaked through the system ANSI codepage on non-Latin Windows
|
||||||
|
locales (cp1252/cp1251/cp950).
|
||||||
|
"""
|
||||||
|
from mempalace.cli import _reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
stdin = _ReconfigurableStringIO()
|
||||||
|
stdout = _ReconfigurableStringIO()
|
||||||
|
stderr = _ReconfigurableStringIO()
|
||||||
|
with (
|
||||||
|
patch.object(sys, "platform", "win32"),
|
||||||
|
patch.object(sys, "stdin", stdin),
|
||||||
|
patch.object(sys, "stdout", stdout),
|
||||||
|
patch.object(sys, "stderr", stderr),
|
||||||
|
):
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
|
# Per-stream errors policy: stdin survives bad bytes via
|
||||||
|
# surrogateescape so a redirected non-UTF-8 file does not crash
|
||||||
|
# the read; stdout/stderr use replace so a drawer carrying a
|
||||||
|
# round-tripped surrogate half does not crash mid-print.
|
||||||
|
assert stdin.reconfigure_calls == [{"encoding": "utf-8", "errors": "surrogateescape"}]
|
||||||
|
assert stdout.reconfigure_calls == [{"encoding": "utf-8", "errors": "replace"}]
|
||||||
|
assert stderr.reconfigure_calls == [{"encoding": "utf-8", "errors": "replace"}]
|
||||||
|
|
||||||
|
|
||||||
|
def test_reconfigure_stdio_is_noop_off_windows():
|
||||||
|
"""Linux/macOS already default to UTF-8 stdio -- helper must not touch streams."""
|
||||||
|
from mempalace.cli import _reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
stdin = _ReconfigurableStringIO()
|
||||||
|
with (
|
||||||
|
patch.object(sys, "platform", "linux"),
|
||||||
|
patch.object(sys, "stdin", stdin),
|
||||||
|
):
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
|
assert stdin.reconfigure_calls == []
|
||||||
|
|
||||||
|
|
||||||
|
# ── cmd_repair: from-sqlite mode exit codes ──────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.cli.MempalaceConfig")
|
||||||
|
def test_cmd_repair_from_sqlite_validation_refusal_exits_nonzero(mock_config_cls, tmp_path, capsys):
|
||||||
|
"""When ``rebuild_from_sqlite`` returns ``{}`` for a validation
|
||||||
|
refusal (missing source DB, in-place without --archive-existing,
|
||||||
|
refusing to overwrite an existing dest), the CLI must surface a
|
||||||
|
non-zero exit so unattended scripts and CI distinguish "invalid
|
||||||
|
inputs" from "successful recovery that found zero rows."
|
||||||
|
|
||||||
|
Catches: a regression where the CLI treats the validation-refusal
|
||||||
|
sentinel as success, leaving CI green on a no-op repair that should
|
||||||
|
have alerted an operator.
|
||||||
|
"""
|
||||||
|
palace_dir = tmp_path / "palace"
|
||||||
|
palace_dir.mkdir()
|
||||||
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
|
||||||
|
args = argparse.Namespace(
|
||||||
|
palace=str(palace_dir),
|
||||||
|
mode="from-sqlite",
|
||||||
|
source=None,
|
||||||
|
archive_existing=False,
|
||||||
|
yes=True,
|
||||||
|
)
|
||||||
|
with patch("mempalace.repair.rebuild_from_sqlite", return_value={}):
|
||||||
|
with pytest.raises(SystemExit) as excinfo:
|
||||||
|
cmd_repair(args)
|
||||||
|
assert excinfo.value.code == 1
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.cli.MempalaceConfig")
|
||||||
|
def test_cmd_repair_from_sqlite_success_does_not_exit(mock_config_cls, tmp_path):
|
||||||
|
"""A successful from-sqlite rebuild — even one that finds zero rows
|
||||||
|
in a legitimately empty source palace — must NOT call ``sys.exit``.
|
||||||
|
A populated counts dict (with ``0`` values) is the success signal;
|
||||||
|
only the empty dict ``{}`` is reserved for validation refusal.
|
||||||
|
|
||||||
|
Catches: a regression where ``if not counts`` is replaced by
|
||||||
|
``if not sum(counts.values())`` or similar, conflating "empty source"
|
||||||
|
with "validation refused" and breaking idempotent recovery scripts.
|
||||||
|
"""
|
||||||
|
palace_dir = tmp_path / "palace"
|
||||||
|
palace_dir.mkdir()
|
||||||
|
mock_config_cls.return_value.palace_path = str(palace_dir)
|
||||||
|
|
||||||
|
args = argparse.Namespace(
|
||||||
|
palace=str(palace_dir),
|
||||||
|
mode="from-sqlite",
|
||||||
|
source=None,
|
||||||
|
archive_existing=False,
|
||||||
|
yes=True,
|
||||||
|
)
|
||||||
|
# Zero rows but per-collection keys present → success, no exit.
|
||||||
|
fake_counts = {"mempalace_drawers": 0, "mempalace_closets": 0}
|
||||||
|
with patch("mempalace.repair.rebuild_from_sqlite", return_value=fake_counts):
|
||||||
|
# Should return cleanly; no SystemExit raised.
|
||||||
|
cmd_repair(args)
|
||||||
|
|||||||
@@ -296,6 +296,182 @@ class TestRegenerateClosets:
|
|||||||
assert meta.get("generated_by", "").startswith("llm:")
|
assert meta.get("generated_by", "").startswith("llm:")
|
||||||
assert meta.get("normalize_version") == NORMALIZE_VERSION
|
assert meta.get("normalize_version") == NORMALIZE_VERSION
|
||||||
|
|
||||||
|
def test_regen_paginates_drawer_fetch(self, tmp_path):
|
||||||
|
"""Regression for #1073: drawers_col.get must be paginated at
|
||||||
|
batch_size=5000. A single get(limit=total, ...) on a palace with
|
||||||
|
more than SQLite's SQLITE_MAX_VARIABLE_NUMBER (32766) drawers
|
||||||
|
blows up inside chromadb. Matches the miner.status pattern
|
||||||
|
introduced in #851 (see #802, #850, #1073)."""
|
||||||
|
from mempalace import closet_llm as closet_llm_mod
|
||||||
|
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
|
||||||
|
# Build a fake collection: 12_000 drawers across 3 source files,
|
||||||
|
# enough to force 3 batches of batch_size=5000 (5000 + 5000 + 2000).
|
||||||
|
n_drawers = 12_000
|
||||||
|
ids = [f"d{i:05d}" for i in range(n_drawers)]
|
||||||
|
docs = [f"doc body {i}" for i in range(n_drawers)]
|
||||||
|
metas = [
|
||||||
|
{
|
||||||
|
"wing": "w",
|
||||||
|
"room": "r",
|
||||||
|
"source_file": f"/src/file_{i % 3}.md",
|
||||||
|
"entities": "",
|
||||||
|
}
|
||||||
|
for i in range(n_drawers)
|
||||||
|
]
|
||||||
|
|
||||||
|
get_calls: list = []
|
||||||
|
|
||||||
|
class FakeDrawersCol:
|
||||||
|
def count(self):
|
||||||
|
return n_drawers
|
||||||
|
|
||||||
|
def get(self, limit=None, offset=0, include=None, **kwargs):
|
||||||
|
get_calls.append({"limit": limit, "offset": offset, "include": include})
|
||||||
|
end = min(offset + (limit or n_drawers), n_drawers)
|
||||||
|
return {
|
||||||
|
"ids": ids[offset:end],
|
||||||
|
"documents": docs[offset:end],
|
||||||
|
"metadatas": metas[offset:end],
|
||||||
|
}
|
||||||
|
|
||||||
|
class FakeClosetsCol:
|
||||||
|
"""Accept the purge + upsert calls the success path makes."""
|
||||||
|
|
||||||
|
def get(self, *a, **kw):
|
||||||
|
return {"ids": [], "documents": [], "metadatas": []}
|
||||||
|
|
||||||
|
def delete(self, *a, **kw):
|
||||||
|
return None
|
||||||
|
|
||||||
|
def upsert(self, *a, **kw):
|
||||||
|
return None
|
||||||
|
|
||||||
|
fake_drawers = FakeDrawersCol()
|
||||||
|
fake_closets = FakeClosetsCol()
|
||||||
|
|
||||||
|
def fake_urlopen(req, timeout=None):
|
||||||
|
return _FakeResp(
|
||||||
|
{
|
||||||
|
"choices": [
|
||||||
|
{"message": {"content": '{"topics":["t1"],"quotes":[],"summary":""}'}}
|
||||||
|
],
|
||||||
|
"usage": {"prompt_tokens": 1, "completion_tokens": 1},
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
cfg = LLMConfig(endpoint="http://local/v1", model="m")
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(closet_llm_mod, "get_collection", return_value=fake_drawers),
|
||||||
|
patch.object(closet_llm_mod, "get_closets_collection", return_value=fake_closets),
|
||||||
|
patch.object(closet_llm_mod, "purge_file_closets", return_value=None),
|
||||||
|
patch.object(closet_llm_mod, "upsert_closet_lines", return_value=None),
|
||||||
|
patch("urllib.request.urlopen", side_effect=fake_urlopen),
|
||||||
|
):
|
||||||
|
result = regenerate_closets(palace, cfg=cfg, dry_run=True)
|
||||||
|
|
||||||
|
# Three paginated calls: (limit=5000, offset=0), (5000, 5000), (5000, 10000).
|
||||||
|
assert len(get_calls) == 3, f"expected 3 batched fetches, got {len(get_calls)}"
|
||||||
|
for call in get_calls:
|
||||||
|
assert (
|
||||||
|
call["limit"] == 5000
|
||||||
|
), f"batch must be 5000 — got {call['limit']} (would risk SQLITE_MAX_VARIABLE_NUMBER)"
|
||||||
|
# include must still request both documents and metadatas
|
||||||
|
assert "documents" in call["include"]
|
||||||
|
assert "metadatas" in call["include"]
|
||||||
|
assert [c["offset"] for c in get_calls] == [0, 5000, 10_000]
|
||||||
|
|
||||||
|
# by_source aggregation must be preserved exactly across batches:
|
||||||
|
# 12_000 drawers, 3 source files → 4_000 drawers each.
|
||||||
|
# dry_run=True short-circuits LLM calls but still walks by_source.
|
||||||
|
assert result.get("processed", 0) == 0 # dry_run
|
||||||
|
# Verify no single call tried to pull more than batch_size.
|
||||||
|
assert max(c["limit"] for c in get_calls) <= 5000
|
||||||
|
|
||||||
|
def test_regen_by_source_aggregates_across_batches(self, tmp_path):
|
||||||
|
"""Pagination must not change the by_source grouping — drawers for
|
||||||
|
the same source_file split across batches still land in one group."""
|
||||||
|
from mempalace import closet_llm as closet_llm_mod
|
||||||
|
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
|
||||||
|
# 7_500 drawers, alternating between two source files → forces
|
||||||
|
# splits across the 5000/2500 boundary. Each source ends up with
|
||||||
|
# 3_750 drawers after regrouping.
|
||||||
|
n_drawers = 7_500
|
||||||
|
ids = [f"d{i:05d}" for i in range(n_drawers)]
|
||||||
|
docs = [f"body-{i}" for i in range(n_drawers)]
|
||||||
|
metas = [
|
||||||
|
{
|
||||||
|
"wing": "w",
|
||||||
|
"room": "r",
|
||||||
|
"source_file": f"/src/file_{i % 2}.md",
|
||||||
|
"entities": "",
|
||||||
|
}
|
||||||
|
for i in range(n_drawers)
|
||||||
|
]
|
||||||
|
|
||||||
|
captured_sources: dict = {}
|
||||||
|
|
||||||
|
class FakeDrawersCol:
|
||||||
|
def count(self):
|
||||||
|
return n_drawers
|
||||||
|
|
||||||
|
def get(self, limit=None, offset=0, include=None, **kwargs):
|
||||||
|
end = min(offset + (limit or n_drawers), n_drawers)
|
||||||
|
return {
|
||||||
|
"ids": ids[offset:end],
|
||||||
|
"documents": docs[offset:end],
|
||||||
|
"metadatas": metas[offset:end],
|
||||||
|
}
|
||||||
|
|
||||||
|
class FakeClosetsCol:
|
||||||
|
def get(self, *a, **kw):
|
||||||
|
return {"ids": [], "documents": [], "metadatas": []}
|
||||||
|
|
||||||
|
def delete(self, *a, **kw):
|
||||||
|
return None
|
||||||
|
|
||||||
|
def upsert(self, *a, **kw):
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Hook _call_llm to inspect what regenerate_closets aggregated
|
||||||
|
# per source before the HTTP boundary.
|
||||||
|
real_call_llm = closet_llm_mod._call_llm
|
||||||
|
|
||||||
|
def spying_call_llm(cfg, source_file, wing, room, content):
|
||||||
|
captured_sources[source_file] = content
|
||||||
|
return (
|
||||||
|
{"topics": ["t"], "quotes": [], "summary": ""},
|
||||||
|
{"prompt_tokens": 1, "completion_tokens": 1},
|
||||||
|
)
|
||||||
|
|
||||||
|
cfg = LLMConfig(endpoint="http://local/v1", model="m")
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(closet_llm_mod, "get_collection", return_value=FakeDrawersCol()),
|
||||||
|
patch.object(closet_llm_mod, "get_closets_collection", return_value=FakeClosetsCol()),
|
||||||
|
patch.object(closet_llm_mod, "purge_file_closets", return_value=None),
|
||||||
|
patch.object(closet_llm_mod, "upsert_closet_lines", return_value=None),
|
||||||
|
patch.object(closet_llm_mod, "_call_llm", side_effect=spying_call_llm),
|
||||||
|
):
|
||||||
|
regenerate_closets(palace, cfg=cfg)
|
||||||
|
|
||||||
|
# Both sources survived the pagination boundary.
|
||||||
|
assert set(captured_sources.keys()) == {"/src/file_0.md", "/src/file_1.md"}
|
||||||
|
# Each source accumulated exactly 3_750 drawer bodies, concatenated
|
||||||
|
# with the "\n\n" separator the regenerate path uses.
|
||||||
|
for source, content in captured_sources.items():
|
||||||
|
assert content.count("\n\n") == 3_749, (
|
||||||
|
f"{source}: expected 3_750 chunks joined (3_749 separators), "
|
||||||
|
f"got {content.count(chr(10) + chr(10)) + 1}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Silence unused-var lint.
|
||||||
|
assert real_call_llm is not None
|
||||||
|
|
||||||
def test_regen_uses_basename_not_split_slash(self, tmp_path, monkeypatch):
|
def test_regen_uses_basename_not_split_slash(self, tmp_path, monkeypatch):
|
||||||
"""Regression: the old closet_id base used ``source.split('/')[-1]``
|
"""Regression: the old closet_id base used ``source.split('/')[-1]``
|
||||||
which silently degrades on Windows paths (``C:\\proj\\a.md`` →
|
which silently degrades on Windows paths (``C:\\proj\\a.md`` →
|
||||||
|
|||||||
+73
-1
@@ -3,7 +3,13 @@ import json
|
|||||||
import tempfile
|
import tempfile
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
from mempalace.config import MempalaceConfig, normalize_wing_name, sanitize_kg_value, sanitize_name
|
from mempalace.config import (
|
||||||
|
MempalaceConfig,
|
||||||
|
normalize_wing_name,
|
||||||
|
sanitize_iso_date,
|
||||||
|
sanitize_kg_value,
|
||||||
|
sanitize_name,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def test_default_config():
|
def test_default_config():
|
||||||
@@ -212,3 +218,69 @@ def test_kg_value_rejects_null_bytes():
|
|||||||
def test_kg_value_rejects_over_length():
|
def test_kg_value_rejects_over_length():
|
||||||
with pytest.raises(ValueError):
|
with pytest.raises(ValueError):
|
||||||
sanitize_kg_value("a" * 129)
|
sanitize_kg_value("a" * 129)
|
||||||
|
|
||||||
|
|
||||||
|
# --- sanitize_iso_date ---
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_year_only():
|
||||||
|
# Partial dates re-introduce silent empty result sets via lexicographic
|
||||||
|
# TEXT comparison in KG queries (e.g. "2026-01-01" <= "2026" is False).
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("2026")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_year_month():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("2026-03")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_accepts_full_date():
|
||||||
|
assert sanitize_iso_date("2026-03-15") == "2026-03-15"
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_passes_through_none():
|
||||||
|
assert sanitize_iso_date(None) is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_passes_through_empty_string():
|
||||||
|
assert sanitize_iso_date("") == ""
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_strips_whitespace():
|
||||||
|
assert sanitize_iso_date(" 2026-03-15 ") == "2026-03-15"
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_natural_language():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("March 2026")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_abbreviated_month():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("Jan 2025")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_us_format():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("03/15/2026")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_invalid_month():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("2026-13")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_invalid_day():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date("2026-02-32")
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_rejects_non_string():
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
sanitize_iso_date(20260315)
|
||||||
|
|
||||||
|
|
||||||
|
def test_iso_date_error_names_field():
|
||||||
|
with pytest.raises(ValueError, match="valid_from"):
|
||||||
|
sanitize_iso_date("yesterday", "valid_from")
|
||||||
|
|||||||
@@ -2,6 +2,8 @@
|
|||||||
|
|
||||||
from unittest.mock import patch
|
from unittest.mock import patch
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
from mempalace.entity_registry import (
|
from mempalace.entity_registry import (
|
||||||
COMMON_ENGLISH_WORDS,
|
COMMON_ENGLISH_WORDS,
|
||||||
PERSON_CONTEXT_PATTERNS,
|
PERSON_CONTEXT_PATTERNS,
|
||||||
@@ -71,6 +73,50 @@ def test_save_creates_file(tmp_path):
|
|||||||
assert (tmp_path / "entity_registry.json").exists()
|
assert (tmp_path / "entity_registry.json").exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_save_is_atomic_does_not_leave_tmp(tmp_path):
|
||||||
|
# Atomic write must not leave the .tmp sidecar file after a successful save.
|
||||||
|
registry = EntityRegistry.load(config_dir=tmp_path)
|
||||||
|
registry.save()
|
||||||
|
leftover = list(tmp_path.glob("entity_registry.json.tmp*"))
|
||||||
|
assert leftover == [], f"atomic write leaked tmp file(s): {leftover}"
|
||||||
|
|
||||||
|
|
||||||
|
def test_save_preserves_previous_on_serialization_failure(tmp_path, monkeypatch):
|
||||||
|
# If serialization fails mid-write, the previous registry must remain
|
||||||
|
# intact — this is the whole point of atomic write vs truncating in place.
|
||||||
|
registry = EntityRegistry.load(config_dir=tmp_path)
|
||||||
|
registry.seed(
|
||||||
|
mode="personal",
|
||||||
|
people=[{"name": "Alice", "relationship": "friend", "context": "personal"}],
|
||||||
|
projects=[],
|
||||||
|
)
|
||||||
|
registry.save()
|
||||||
|
target = tmp_path / "entity_registry.json"
|
||||||
|
original = target.read_text(encoding="utf-8")
|
||||||
|
|
||||||
|
# Force os.replace to raise — simulates filesystem full / permission flip
|
||||||
|
# AFTER the temp file is written but BEFORE the rename completes.
|
||||||
|
import os as _os
|
||||||
|
|
||||||
|
real_replace = _os.replace
|
||||||
|
|
||||||
|
def boom(src, dst):
|
||||||
|
raise OSError("simulated rename failure")
|
||||||
|
|
||||||
|
monkeypatch.setattr(_os, "replace", boom)
|
||||||
|
with pytest.raises(OSError):
|
||||||
|
registry.seed(
|
||||||
|
mode="personal",
|
||||||
|
people=[{"name": "Bob", "relationship": "friend", "context": "personal"}],
|
||||||
|
projects=[],
|
||||||
|
)
|
||||||
|
registry.save()
|
||||||
|
|
||||||
|
# Restore os.replace before reading so the assertion can rely on it.
|
||||||
|
monkeypatch.setattr(_os, "replace", real_replace)
|
||||||
|
assert target.read_text(encoding="utf-8") == original
|
||||||
|
|
||||||
|
|
||||||
# ── seed ────────────────────────────────────────────────────────────────
|
# ── seed ────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -286,3 +286,66 @@ class TestCLI:
|
|||||||
assert "similar_name" in out
|
assert "similar_name" in out
|
||||||
# Silence unused import warning.
|
# Silence unused import warning.
|
||||||
_ = (MagicMock, patch, fact_checker)
|
_ = (MagicMock, patch, fact_checker)
|
||||||
|
|
||||||
|
def test_reconfigures_stdio_to_utf8_on_windows(self):
|
||||||
|
"""Windows fact_checker --stdin must decode payload as UTF-8.
|
||||||
|
|
||||||
|
Without this, Python defaults stdio to the system ANSI codepage
|
||||||
|
(cp1252/cp1251/cp950), which mojibakes non-ASCII text before
|
||||||
|
pattern parsing sees it.
|
||||||
|
"""
|
||||||
|
import io
|
||||||
|
import sys
|
||||||
|
|
||||||
|
from mempalace.fact_checker import _reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
class _ReconfigurableStringIO(io.StringIO):
|
||||||
|
def __init__(self, initial_value=""):
|
||||||
|
super().__init__(initial_value)
|
||||||
|
self.reconfigure_calls = []
|
||||||
|
|
||||||
|
def reconfigure(self, **kwargs):
|
||||||
|
self.reconfigure_calls.append(kwargs)
|
||||||
|
|
||||||
|
stdin = _ReconfigurableStringIO()
|
||||||
|
stdout = _ReconfigurableStringIO()
|
||||||
|
stderr = _ReconfigurableStringIO()
|
||||||
|
with (
|
||||||
|
patch.object(sys, "platform", "win32"),
|
||||||
|
patch.object(sys, "stdin", stdin),
|
||||||
|
patch.object(sys, "stdout", stdout),
|
||||||
|
patch.object(sys, "stderr", stderr),
|
||||||
|
):
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
|
# Per-stream errors policy: stdin uses surrogateescape so a stray
|
||||||
|
# malformed byte from a redirected file does not crash the read,
|
||||||
|
# stdout/stderr use replace so an extracted fact carrying a
|
||||||
|
# surrogate half does not crash mid-print.
|
||||||
|
assert stdin.reconfigure_calls == [{"encoding": "utf-8", "errors": "surrogateescape"}]
|
||||||
|
assert stdout.reconfigure_calls == [{"encoding": "utf-8", "errors": "replace"}]
|
||||||
|
assert stderr.reconfigure_calls == [{"encoding": "utf-8", "errors": "replace"}]
|
||||||
|
|
||||||
|
def test_reconfigure_stdio_is_noop_off_windows(self):
|
||||||
|
"""Linux/macOS already default to UTF-8 stdio -- helper must not touch streams."""
|
||||||
|
import io
|
||||||
|
import sys
|
||||||
|
|
||||||
|
from mempalace.fact_checker import _reconfigure_stdio_utf8_on_windows
|
||||||
|
|
||||||
|
class _ReconfigurableStringIO(io.StringIO):
|
||||||
|
def __init__(self):
|
||||||
|
super().__init__()
|
||||||
|
self.reconfigure_calls = []
|
||||||
|
|
||||||
|
def reconfigure(self, **kwargs):
|
||||||
|
self.reconfigure_calls.append(kwargs)
|
||||||
|
|
||||||
|
stdin = _ReconfigurableStringIO()
|
||||||
|
with (
|
||||||
|
patch.object(sys, "platform", "linux"),
|
||||||
|
patch.object(sys, "stdin", stdin),
|
||||||
|
):
|
||||||
|
_reconfigure_stdio_utf8_on_windows()
|
||||||
|
|
||||||
|
assert stdin.reconfigure_calls == []
|
||||||
|
|||||||
+158
-4
@@ -238,14 +238,40 @@ def test_capacity_status_tolerates_flush_lag(tmp_path):
|
|||||||
assert info["status"] == "ok"
|
assert info["status"] == "ok"
|
||||||
|
|
||||||
|
|
||||||
def test_capacity_status_flags_unflushed_with_large_sqlite(tmp_path):
|
def test_capacity_status_does_not_flag_unflushed_with_large_sqlite(tmp_path):
|
||||||
"""No pickle + many sqlite rows is its own divergence signal."""
|
"""No pickle + many sqlite rows is inconclusive, not divergence."""
|
||||||
seg = "seg-noflush"
|
seg = "seg-noflush"
|
||||||
_seed_chroma_db(str(tmp_path), sqlite_count=10_000, segment_id=seg)
|
_seed_chroma_db(str(tmp_path), sqlite_count=10_000, segment_id=seg)
|
||||||
info = hnsw_capacity_status(str(tmp_path), COLLECTION)
|
info = hnsw_capacity_status(str(tmp_path), COLLECTION)
|
||||||
assert info["diverged"] is True
|
assert info["diverged"] is False
|
||||||
|
assert info["status"] == "unknown"
|
||||||
|
assert info["divergence"] is None
|
||||||
assert info["hnsw_count"] is None
|
assert info["hnsw_count"] is None
|
||||||
assert "never flushed" in info["message"]
|
assert "capacity unavailable" in info["message"]
|
||||||
|
assert "leaving vector search enabled" in info["message"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_mcp_probe_does_not_disable_vectors_for_unflushed_metadata(tmp_path, monkeypatch):
|
||||||
|
"""The MCP preflight must not route all searches to BM25 on this signal."""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
seg = "seg-mcp-noflush"
|
||||||
|
_seed_chroma_db(str(tmp_path), sqlite_count=10_000, segment_id=seg)
|
||||||
|
|
||||||
|
class _Cfg:
|
||||||
|
palace_path = str(tmp_path)
|
||||||
|
collection_name = "mempalace_drawers"
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_config", _Cfg())
|
||||||
|
monkeypatch.setattr(mcp_server, "_vector_disabled", True)
|
||||||
|
monkeypatch.setattr(mcp_server, "_vector_disabled_reason", "old divergence")
|
||||||
|
|
||||||
|
mcp_server._refresh_vector_disabled_flag()
|
||||||
|
|
||||||
|
assert mcp_server._vector_disabled is False
|
||||||
|
assert mcp_server._vector_disabled_reason == ""
|
||||||
|
assert mcp_server._vector_capacity_status["status"] == "unknown"
|
||||||
|
assert "leaving vector search enabled" in mcp_server._vector_capacity_status["message"]
|
||||||
|
|
||||||
|
|
||||||
def test_capacity_status_quiet_for_empty_palace(tmp_path):
|
def test_capacity_status_quiet_for_empty_palace(tmp_path):
|
||||||
@@ -372,6 +398,17 @@ def _seed_drawers(palace: str, segment_id: str, drawers: list[tuple[str, dict, s
|
|||||||
conn.close()
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
def _set_drawer_created_at(palace: str, timestamps: dict[int, str]) -> None:
|
||||||
|
db_path = os.path.join(palace, "chroma.sqlite3")
|
||||||
|
conn = sqlite3.connect(db_path)
|
||||||
|
try:
|
||||||
|
for emb_id, created_at in timestamps.items():
|
||||||
|
conn.execute("UPDATE embeddings SET created_at = ? WHERE id = ?", (created_at, emb_id))
|
||||||
|
conn.commit()
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
@pytest.fixture
|
@pytest.fixture
|
||||||
def palace_with_drawers(tmp_path):
|
def palace_with_drawers(tmp_path):
|
||||||
seg = "seg-bm25"
|
seg = "seg-bm25"
|
||||||
@@ -417,6 +454,122 @@ def test_bm25_fallback_filters_by_wing(palace_with_drawers):
|
|||||||
assert all(r["wing"] == "design" for r in out["results"])
|
assert all(r["wing"] == "design" for r in out["results"])
|
||||||
|
|
||||||
|
|
||||||
|
def test_bm25_fallback_applies_wing_before_fts_candidate_limit(tmp_path):
|
||||||
|
seg = "seg-bm25-fts-limit"
|
||||||
|
_seed_chroma_db(str(tmp_path), sqlite_count=0, segment_id=seg)
|
||||||
|
_seed_drawers(
|
||||||
|
str(tmp_path),
|
||||||
|
seg,
|
||||||
|
[
|
||||||
|
(
|
||||||
|
"shared token outside target wing",
|
||||||
|
{"wing": "ops", "room": "incidents", "source_file": "/x/ops.md"},
|
||||||
|
"d-1",
|
||||||
|
),
|
||||||
|
(
|
||||||
|
"shared token inside target wing",
|
||||||
|
{"wing": "project", "room": "diary", "source_file": "/x/project.md"},
|
||||||
|
"d-2",
|
||||||
|
),
|
||||||
|
],
|
||||||
|
)
|
||||||
|
|
||||||
|
out = _bm25_only_via_sqlite("shared token", str(tmp_path), wing="project", max_candidates=1)
|
||||||
|
|
||||||
|
assert out["total_before_filter"] == 1
|
||||||
|
assert len(out["results"]) == 1
|
||||||
|
assert out["results"][0]["wing"] == "project"
|
||||||
|
|
||||||
|
|
||||||
|
def test_bm25_fallback_applies_room_before_fts_candidate_limit(tmp_path):
|
||||||
|
seg = "seg-bm25-room-limit"
|
||||||
|
_seed_chroma_db(str(tmp_path), sqlite_count=0, segment_id=seg)
|
||||||
|
_seed_drawers(
|
||||||
|
str(tmp_path),
|
||||||
|
seg,
|
||||||
|
[
|
||||||
|
(
|
||||||
|
"shared token wrong room",
|
||||||
|
{"wing": "project", "room": "scratch", "source_file": "/x/scratch.md"},
|
||||||
|
"d-1",
|
||||||
|
),
|
||||||
|
(
|
||||||
|
"shared token right room",
|
||||||
|
{"wing": "project", "room": "diary", "source_file": "/x/diary.md"},
|
||||||
|
"d-2",
|
||||||
|
),
|
||||||
|
],
|
||||||
|
)
|
||||||
|
|
||||||
|
out = _bm25_only_via_sqlite(
|
||||||
|
"shared token",
|
||||||
|
str(tmp_path),
|
||||||
|
wing="project",
|
||||||
|
room="diary",
|
||||||
|
max_candidates=1,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert out["total_before_filter"] == 1
|
||||||
|
assert len(out["results"]) == 1
|
||||||
|
assert out["results"][0]["wing"] == "project"
|
||||||
|
assert out["results"][0]["room"] == "diary"
|
||||||
|
|
||||||
|
|
||||||
|
def test_bm25_fallback_applies_wing_before_recency_candidate_limit(tmp_path):
|
||||||
|
seg = "seg-bm25-recency-limit"
|
||||||
|
_seed_chroma_db(str(tmp_path), sqlite_count=0, segment_id=seg)
|
||||||
|
_seed_drawers(
|
||||||
|
str(tmp_path),
|
||||||
|
seg,
|
||||||
|
[
|
||||||
|
(
|
||||||
|
"target drawer for short query",
|
||||||
|
{"wing": "project", "room": "diary", "source_file": "/x/project.md"},
|
||||||
|
"d-1",
|
||||||
|
),
|
||||||
|
(
|
||||||
|
"newer drawer outside target wing",
|
||||||
|
{"wing": "ops", "room": "incidents", "source_file": "/x/ops.md"},
|
||||||
|
"d-2",
|
||||||
|
),
|
||||||
|
],
|
||||||
|
)
|
||||||
|
_set_drawer_created_at(
|
||||||
|
str(tmp_path),
|
||||||
|
{
|
||||||
|
1: "2026-01-01 00:00:00",
|
||||||
|
2: "2026-02-01 00:00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
out = _bm25_only_via_sqlite("a", str(tmp_path), wing="project", max_candidates=1)
|
||||||
|
|
||||||
|
assert out["total_before_filter"] == 1
|
||||||
|
assert len(out["results"]) == 1
|
||||||
|
assert out["results"][0]["wing"] == "project"
|
||||||
|
|
||||||
|
|
||||||
|
def test_bm25_fallback_returns_empty_when_filtered_wing_has_no_candidates(tmp_path):
|
||||||
|
seg = "seg-bm25-empty-filter"
|
||||||
|
_seed_chroma_db(str(tmp_path), sqlite_count=0, segment_id=seg)
|
||||||
|
_seed_drawers(
|
||||||
|
str(tmp_path),
|
||||||
|
seg,
|
||||||
|
[
|
||||||
|
(
|
||||||
|
"shared token outside target wing",
|
||||||
|
{"wing": "ops", "room": "incidents", "source_file": "/x/ops.md"},
|
||||||
|
"d-1",
|
||||||
|
),
|
||||||
|
],
|
||||||
|
)
|
||||||
|
|
||||||
|
out = _bm25_only_via_sqlite("shared token", str(tmp_path), wing="project", max_candidates=1)
|
||||||
|
|
||||||
|
assert out["total_before_filter"] == 0
|
||||||
|
assert out["results"] == []
|
||||||
|
|
||||||
|
|
||||||
def test_bm25_fallback_no_palace(tmp_path):
|
def test_bm25_fallback_no_palace(tmp_path):
|
||||||
out = _bm25_only_via_sqlite("anything", str(tmp_path))
|
out = _bm25_only_via_sqlite("anything", str(tmp_path))
|
||||||
assert "error" in out
|
assert "error" in out
|
||||||
@@ -473,6 +626,7 @@ def test_tool_status_via_sqlite_returns_breakdown(palace_with_drawers, monkeypat
|
|||||||
# MempalaceConfig.
|
# MempalaceConfig.
|
||||||
class _Cfg:
|
class _Cfg:
|
||||||
palace_path = str(palace_with_drawers)
|
palace_path = str(palace_with_drawers)
|
||||||
|
collection_name = "mempalace_drawers"
|
||||||
|
|
||||||
monkeypatch.setattr(mcp_server, "_config", _Cfg())
|
monkeypatch.setattr(mcp_server, "_config", _Cfg())
|
||||||
monkeypatch.setattr(mcp_server, "_vector_disabled", True)
|
monkeypatch.setattr(mcp_server, "_vector_disabled", True)
|
||||||
|
|||||||
@@ -0,0 +1,113 @@
|
|||||||
|
import os
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from mempalace.backends.chroma import (
|
||||||
|
_HNSW_LINK_TO_DATA_MAX_RATIO,
|
||||||
|
_hnsw_link_to_data_ratio,
|
||||||
|
_segment_appears_healthy,
|
||||||
|
quarantine_stale_hnsw,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _write_segment(
|
||||||
|
seg_dir: Path,
|
||||||
|
*,
|
||||||
|
data_size: int = 100,
|
||||||
|
link_size: int = 100,
|
||||||
|
write_metadata: bool = True,
|
||||||
|
) -> None:
|
||||||
|
seg_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
(seg_dir / "data_level0.bin").write_bytes(b"\0" * data_size)
|
||||||
|
(seg_dir / "link_lists.bin").write_bytes(b"\0" * link_size)
|
||||||
|
|
||||||
|
if write_metadata:
|
||||||
|
# Enough bytes to pass the existing pickle envelope sniff-test:
|
||||||
|
# starts with pickle protocol marker 0x80 and ends with STOP 0x2e.
|
||||||
|
(seg_dir / "index_metadata.pickle").write_bytes(b"\x80" + b"x" * 16 + b"\x2e")
|
||||||
|
|
||||||
|
|
||||||
|
def test_hnsw_link_to_data_ratio_reports_payload_size_ratio(tmp_path):
|
||||||
|
seg_dir = tmp_path / "11111111-2222-3333-4444-555555555555"
|
||||||
|
_write_segment(seg_dir, data_size=100, link_size=250)
|
||||||
|
|
||||||
|
assert _hnsw_link_to_data_ratio(str(seg_dir)) == 2.5
|
||||||
|
|
||||||
|
|
||||||
|
def test_segment_health_rejects_exploded_link_lists_even_with_valid_pickle(tmp_path):
|
||||||
|
seg_dir = tmp_path / "11111111-2222-3333-4444-555555555555"
|
||||||
|
_write_segment(
|
||||||
|
seg_dir,
|
||||||
|
data_size=100,
|
||||||
|
link_size=int(100 * (_HNSW_LINK_TO_DATA_MAX_RATIO + 1)),
|
||||||
|
write_metadata=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert not _segment_appears_healthy(str(seg_dir))
|
||||||
|
|
||||||
|
|
||||||
|
def test_segment_health_keeps_reasonable_payload_with_valid_pickle(tmp_path):
|
||||||
|
seg_dir = tmp_path / "11111111-2222-3333-4444-555555555555"
|
||||||
|
_write_segment(
|
||||||
|
seg_dir,
|
||||||
|
data_size=100,
|
||||||
|
link_size=int(100 * _HNSW_LINK_TO_DATA_MAX_RATIO),
|
||||||
|
write_metadata=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert _segment_appears_healthy(str(seg_dir))
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_catches_link_bloat_without_mtime_drift(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
|
||||||
|
db_path = palace / "chroma.sqlite3"
|
||||||
|
db_path.write_text("sqlite placeholder")
|
||||||
|
|
||||||
|
seg_dir = palace / "11111111-2222-3333-4444-555555555555"
|
||||||
|
_write_segment(
|
||||||
|
seg_dir,
|
||||||
|
data_size=100,
|
||||||
|
link_size=int(100 * (_HNSW_LINK_TO_DATA_MAX_RATIO + 1)),
|
||||||
|
write_metadata=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Make sqlite and HNSW mtimes identical. The old mtime-only gate would
|
||||||
|
# skip this segment even though the payload is structurally corrupt.
|
||||||
|
same_time = 1_700_000_000
|
||||||
|
os.utime(db_path, (same_time, same_time))
|
||||||
|
os.utime(seg_dir / "data_level0.bin", (same_time, same_time))
|
||||||
|
|
||||||
|
moved = quarantine_stale_hnsw(str(palace), stale_seconds=999_999)
|
||||||
|
|
||||||
|
assert len(moved) == 1
|
||||||
|
assert not seg_dir.exists()
|
||||||
|
|
||||||
|
moved_path = Path(moved[0])
|
||||||
|
assert moved_path.exists()
|
||||||
|
assert moved_path.name.startswith("11111111-2222-3333-4444-555555555555.drift-")
|
||||||
|
|
||||||
|
|
||||||
|
def test_quarantine_leaves_reasonable_payload_in_place(tmp_path):
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
|
||||||
|
db_path = palace / "chroma.sqlite3"
|
||||||
|
db_path.write_text("sqlite placeholder")
|
||||||
|
|
||||||
|
seg_dir = palace / "11111111-2222-3333-4444-555555555555"
|
||||||
|
_write_segment(
|
||||||
|
seg_dir,
|
||||||
|
data_size=100,
|
||||||
|
link_size=100,
|
||||||
|
write_metadata=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
same_time = 1_700_000_000
|
||||||
|
os.utime(db_path, (same_time, same_time))
|
||||||
|
os.utime(seg_dir / "data_level0.bin", (same_time, same_time))
|
||||||
|
|
||||||
|
moved = quarantine_stale_hnsw(str(palace), stale_seconds=999_999)
|
||||||
|
|
||||||
|
assert moved == []
|
||||||
|
assert seg_dir.exists()
|
||||||
@@ -8,6 +8,7 @@ from unittest.mock import MagicMock, patch
|
|||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
|
import mempalace.hooks_cli as hooks_cli_mod
|
||||||
from mempalace.hooks_cli import (
|
from mempalace.hooks_cli import (
|
||||||
SAVE_INTERVAL,
|
SAVE_INTERVAL,
|
||||||
_count_human_messages,
|
_count_human_messages,
|
||||||
@@ -959,3 +960,108 @@ def test_stop_hook_rejects_injected_stop_hook_active(tmp_path):
|
|||||||
# The injected value is not "true"/"1"/"yes", so the hook should NOT pass through.
|
# The injected value is not "true"/"1"/"yes", so the hook should NOT pass through.
|
||||||
# Save must have been attempted.
|
# Save must have been attempted.
|
||||||
assert mock_save.called
|
assert mock_save.called
|
||||||
|
|
||||||
|
|
||||||
|
# --- Absent palace root: hooks must not recreate ~/.mempalace ---
|
||||||
|
#
|
||||||
|
# When the user removes ~/.mempalace (e.g. `rm -rf`), that is the strongest
|
||||||
|
# possible "do not auto-capture" signal. Hooks must short-circuit BEFORE
|
||||||
|
# touching disk — including before the log-line that previously triggered
|
||||||
|
# STATE_DIR.mkdir() on its own.
|
||||||
|
|
||||||
|
|
||||||
|
def _redirect_palace_root(monkeypatch, tmp_path):
|
||||||
|
"""Point PALACE_ROOT and STATE_DIR at a tmp location that does NOT exist."""
|
||||||
|
fake_root = tmp_path / "absent-mempalace"
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "PALACE_ROOT", fake_root)
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "STATE_DIR", fake_root / "hook_state")
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "_state_dir_initialized", False)
|
||||||
|
return fake_root
|
||||||
|
|
||||||
|
|
||||||
|
def test_hook_stop_does_not_create_palace_dir_when_absent(tmp_path, monkeypatch):
|
||||||
|
fake_root = _redirect_palace_root(monkeypatch, tmp_path)
|
||||||
|
transcript = tmp_path / "t.jsonl"
|
||||||
|
transcript.write_text("")
|
||||||
|
buf = io.StringIO()
|
||||||
|
with contextlib.redirect_stdout(buf):
|
||||||
|
hook_stop(
|
||||||
|
{"session_id": "absent", "transcript_path": str(transcript), "stop_hook_active": False},
|
||||||
|
"claude-code",
|
||||||
|
)
|
||||||
|
assert json.loads(buf.getvalue() or "{}") == {}
|
||||||
|
assert not fake_root.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_hook_precompact_does_not_create_palace_dir_when_absent(tmp_path, monkeypatch):
|
||||||
|
fake_root = _redirect_palace_root(monkeypatch, tmp_path)
|
||||||
|
transcript = tmp_path / "t.jsonl"
|
||||||
|
transcript.write_text("")
|
||||||
|
buf = io.StringIO()
|
||||||
|
with contextlib.redirect_stdout(buf):
|
||||||
|
hook_precompact(
|
||||||
|
{"session_id": "absent", "transcript_path": str(transcript)},
|
||||||
|
"claude-code",
|
||||||
|
)
|
||||||
|
assert json.loads(buf.getvalue() or "{}") == {}
|
||||||
|
assert not fake_root.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_hook_session_start_does_not_create_palace_dir_when_absent(tmp_path, monkeypatch):
|
||||||
|
fake_root = _redirect_palace_root(monkeypatch, tmp_path)
|
||||||
|
buf = io.StringIO()
|
||||||
|
with contextlib.redirect_stdout(buf):
|
||||||
|
hook_session_start({"session_id": "absent"}, "claude-code")
|
||||||
|
assert json.loads(buf.getvalue() or "{}") == {}
|
||||||
|
assert not fake_root.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_log_does_not_create_palace_dir_when_absent(tmp_path, monkeypatch):
|
||||||
|
fake_root = _redirect_palace_root(monkeypatch, tmp_path)
|
||||||
|
_log("test message")
|
||||||
|
assert not fake_root.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_existing_dir_proceeds_normally(tmp_path, monkeypatch):
|
||||||
|
"""Regression: when PALACE_ROOT exists, hooks must proceed (no short-circuit)."""
|
||||||
|
fake_root = tmp_path / "present-mempalace"
|
||||||
|
fake_root.mkdir()
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "PALACE_ROOT", fake_root)
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "STATE_DIR", fake_root / "hook_state")
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "_state_dir_initialized", False)
|
||||||
|
_log("test message")
|
||||||
|
# _log should have created the state dir under the existing palace root
|
||||||
|
assert (fake_root / "hook_state").exists()
|
||||||
|
assert (fake_root / "hook_state" / "hook.log").is_file()
|
||||||
|
|
||||||
|
|
||||||
|
def test_regular_file_at_palace_root_treated_as_absent(tmp_path, monkeypatch):
|
||||||
|
"""A regular file at ~/.mempalace must be treated the same as absent.
|
||||||
|
|
||||||
|
``Path.exists()`` returns True for a regular file, which would let the
|
||||||
|
kill-switch be bypassed and crash later when ``STATE_DIR.mkdir()`` runs
|
||||||
|
on ``NotADirectoryError``. ``_palace_root_exists()`` must use
|
||||||
|
``is_dir()`` so a stray file (or broken symlink) short-circuits cleanly.
|
||||||
|
"""
|
||||||
|
fake_root = tmp_path / "file-not-dir"
|
||||||
|
fake_root.write_text("oops, this is a file not a directory")
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "PALACE_ROOT", fake_root)
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "STATE_DIR", fake_root / "hook_state")
|
||||||
|
monkeypatch.setattr(hooks_cli_mod, "_state_dir_initialized", False)
|
||||||
|
|
||||||
|
# _palace_root_exists() is the source of truth — it must return False.
|
||||||
|
assert hooks_cli_mod._palace_root_exists() is False
|
||||||
|
|
||||||
|
# Hooks must short-circuit (return {} on stdout) and not touch disk.
|
||||||
|
buf = io.StringIO()
|
||||||
|
with contextlib.redirect_stdout(buf):
|
||||||
|
hook_session_start({"session_id": "file-at-root"}, "claude-code")
|
||||||
|
assert json.loads(buf.getvalue() or "{}") == {}
|
||||||
|
|
||||||
|
# _log must also short-circuit — it must NOT try to mkdir a path under a
|
||||||
|
# regular file (which would raise NotADirectoryError).
|
||||||
|
_log("test message") # would raise if not short-circuited
|
||||||
|
|
||||||
|
# The stray file is left untouched; we never try to convert it.
|
||||||
|
assert fake_root.is_file()
|
||||||
|
assert fake_root.read_text() == "oops, this is a file not a directory"
|
||||||
|
|||||||
@@ -5,6 +5,8 @@ Covers: entity CRUD, triple CRUD, temporal queries, invalidation,
|
|||||||
timeline, stats, and edge cases (duplicate triples, ID collisions).
|
timeline, stats, and edge cases (duplicate triples, ID collisions).
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
|
||||||
class TestEntityOperations:
|
class TestEntityOperations:
|
||||||
def test_add_entity(self, kg):
|
def test_add_entity(self, kg):
|
||||||
@@ -45,6 +47,38 @@ class TestTripleOperations:
|
|||||||
tid2 = kg.add_triple("Alice", "works_at", "Acme")
|
tid2 = kg.add_triple("Alice", "works_at", "Acme")
|
||||||
assert tid1 != tid2 # new triple since old one was closed
|
assert tid1 != tid2 # new triple since old one was closed
|
||||||
|
|
||||||
|
def test_add_triple_rejects_inverted_interval(self, kg):
|
||||||
|
# valid_to before valid_from would never satisfy
|
||||||
|
# `valid_from <= as_of AND valid_to >= as_of` — silently invisible
|
||||||
|
# to every query. Reject at write time instead.
|
||||||
|
with pytest.raises(ValueError, match="before valid_from"):
|
||||||
|
kg.add_triple(
|
||||||
|
"Alice",
|
||||||
|
"worked_at",
|
||||||
|
"Acme",
|
||||||
|
valid_from="2026-03-01",
|
||||||
|
valid_to="2026-02-01",
|
||||||
|
)
|
||||||
|
|
||||||
|
def test_add_triple_accepts_equal_dates(self, kg):
|
||||||
|
# Same-day intervals are valid (point-in-time facts).
|
||||||
|
tid = kg.add_triple(
|
||||||
|
"Alice",
|
||||||
|
"joined",
|
||||||
|
"Acme",
|
||||||
|
valid_from="2026-03-15",
|
||||||
|
valid_to="2026-03-15",
|
||||||
|
)
|
||||||
|
assert tid.startswith("t_alice_joined_acme_")
|
||||||
|
|
||||||
|
def test_add_triple_allows_only_one_bound(self, kg):
|
||||||
|
# The guard only fires when BOTH bounds are set.
|
||||||
|
tid1 = kg.add_triple("Alice", "knows", "Bob", valid_from="2026-01-01")
|
||||||
|
assert tid1.startswith("t_alice_knows_bob_")
|
||||||
|
kg.invalidate("Alice", "knows", "Bob", ended="2026-02-01")
|
||||||
|
tid2 = kg.add_triple("Alice", "knew", "Bob", valid_to="2026-03-01")
|
||||||
|
assert tid2.startswith("t_alice_knew_bob_")
|
||||||
|
|
||||||
|
|
||||||
class TestQueries:
|
class TestQueries:
|
||||||
def test_query_outgoing(self, seeded_kg):
|
def test_query_outgoing(self, seeded_kg):
|
||||||
|
|||||||
@@ -655,3 +655,72 @@ def test_memory_stack_status_with_palace(tmp_path):
|
|||||||
|
|
||||||
assert result["total_drawers"] == 42
|
assert result["total_drawers"] == 42
|
||||||
assert result["L0_identity"]["exists"] is True
|
assert result["L0_identity"]["exists"] is True
|
||||||
|
|
||||||
|
|
||||||
|
# ── Layer1 / Layer2 None-metadata guards ───────────────────────────────
|
||||||
|
#
|
||||||
|
# Chroma 1.5.x can return ``None`` inside the ``metadatas`` / ``documents``
|
||||||
|
# lists for partially-flushed rows. The Layer1.generate() and
|
||||||
|
# Layer2.retrieve() loops previously called ``meta.get(...)`` without
|
||||||
|
# coercing, raising ``AttributeError: 'NoneType' object has no attribute
|
||||||
|
# 'get'`` and blowing up the whole wake-up render. These tests guard that
|
||||||
|
# the loops tolerate the None entries and render the rest of the result.
|
||||||
|
|
||||||
|
|
||||||
|
def test_layer1_handles_none_metadata():
|
||||||
|
"""Layer1.generate tolerates None entries in the metadatas list."""
|
||||||
|
docs = ["important memory", "another memory"]
|
||||||
|
metas = [{"room": "decisions", "source_file": "a.txt"}, None]
|
||||||
|
mock_col = _mock_chromadb_for_layer(docs, metas)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.layers.MempalaceConfig") as mock_cfg,
|
||||||
|
patch("mempalace.layers._get_collection", return_value=mock_col),
|
||||||
|
):
|
||||||
|
mock_cfg.return_value.palace_path = "/fake"
|
||||||
|
layer = Layer1(palace_path="/fake")
|
||||||
|
# Should not raise AttributeError on the None entry.
|
||||||
|
result = layer.generate()
|
||||||
|
|
||||||
|
assert "ESSENTIAL STORY" in result
|
||||||
|
assert "important memory" in result
|
||||||
|
|
||||||
|
|
||||||
|
def test_layer1_handles_none_document():
|
||||||
|
"""Layer1.generate tolerates None entries in the documents list."""
|
||||||
|
docs = ["first doc", None]
|
||||||
|
metas = [
|
||||||
|
{"room": "r", "source_file": "a.txt"},
|
||||||
|
{"room": "r", "source_file": "b.txt"},
|
||||||
|
]
|
||||||
|
mock_col = _mock_chromadb_for_layer(docs, metas)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.layers.MempalaceConfig") as mock_cfg,
|
||||||
|
patch("mempalace.layers._get_collection", return_value=mock_col),
|
||||||
|
):
|
||||||
|
mock_cfg.return_value.palace_path = "/fake"
|
||||||
|
layer = Layer1(palace_path="/fake")
|
||||||
|
result = layer.generate()
|
||||||
|
|
||||||
|
assert result # Render succeeded despite the None document.
|
||||||
|
|
||||||
|
|
||||||
|
def test_layer2_handles_none_metadata():
|
||||||
|
"""Layer2.retrieve tolerates None entries in the metadatas list."""
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"documents": ["first doc", "second doc"],
|
||||||
|
"metadatas": [{"room": "r", "source_file": "a.txt"}, None],
|
||||||
|
}
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.layers.MempalaceConfig") as mock_cfg,
|
||||||
|
patch("mempalace.layers._get_collection", return_value=mock_col),
|
||||||
|
):
|
||||||
|
mock_cfg.return_value.palace_path = "/fake"
|
||||||
|
layer = Layer2(palace_path="/fake")
|
||||||
|
# Should not raise AttributeError on the None entry.
|
||||||
|
result = layer.retrieve()
|
||||||
|
|
||||||
|
assert "L2 — ON-DEMAND" in result
|
||||||
|
|||||||
+448
-1
@@ -8,7 +8,9 @@ via monkeypatch to avoid touching real data.
|
|||||||
|
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
import json
|
import json
|
||||||
|
import os
|
||||||
import sys
|
import sys
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
@@ -18,7 +20,7 @@ def _patch_mcp_server(monkeypatch, config, kg):
|
|||||||
from mempalace import mcp_server
|
from mempalace import mcp_server
|
||||||
|
|
||||||
monkeypatch.setattr(mcp_server, "_config", config)
|
monkeypatch.setattr(mcp_server, "_config", config)
|
||||||
monkeypatch.setattr(mcp_server, "_kg", kg)
|
monkeypatch.setattr(mcp_server, "_get_kg", lambda: kg)
|
||||||
|
|
||||||
|
|
||||||
def _get_collection(palace_path, create=False):
|
def _get_collection(palace_path, create=False):
|
||||||
@@ -146,6 +148,20 @@ class TestHandleRequest:
|
|||||||
)
|
)
|
||||||
assert resp["error"]["code"] == -32601
|
assert resp["error"]["code"] == -32601
|
||||||
|
|
||||||
|
def test_tools_call_missing_params(self):
|
||||||
|
from mempalace.mcp_server import handle_request
|
||||||
|
|
||||||
|
for bad_params in [None, {}, {"arguments": {}}]:
|
||||||
|
resp = handle_request(
|
||||||
|
{
|
||||||
|
"method": "tools/call",
|
||||||
|
"id": 15,
|
||||||
|
"params": bad_params,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
assert resp["error"]["code"] == -32602
|
||||||
|
assert "Invalid params" in resp["error"]["message"]
|
||||||
|
|
||||||
def test_unknown_method(self):
|
def test_unknown_method(self):
|
||||||
from mempalace.mcp_server import handle_request
|
from mempalace.mcp_server import handle_request
|
||||||
|
|
||||||
@@ -188,6 +204,17 @@ class TestHandleRequest:
|
|||||||
resp = handle_request({"method": None, "id": 99, "params": {}})
|
resp = handle_request({"method": None, "id": 99, "params": {}})
|
||||||
assert resp["error"]["code"] == -32601
|
assert resp["error"]["code"] == -32601
|
||||||
|
|
||||||
|
@pytest.mark.parametrize("payload", [None, [], "plain", 42, True])
|
||||||
|
def test_handle_request_invalid_payload_returns_jsonrpc_error(self, payload):
|
||||||
|
from mempalace.mcp_server import handle_request
|
||||||
|
|
||||||
|
resp = handle_request(payload)
|
||||||
|
assert resp == {
|
||||||
|
"jsonrpc": "2.0",
|
||||||
|
"id": None,
|
||||||
|
"error": {"code": -32600, "message": "Invalid Request"},
|
||||||
|
}
|
||||||
|
|
||||||
def test_tools_call_dispatches(self, monkeypatch, config, palace_path, seeded_kg):
|
def test_tools_call_dispatches(self, monkeypatch, config, palace_path, seeded_kg):
|
||||||
_patch_mcp_server(monkeypatch, config, seeded_kg)
|
_patch_mcp_server(monkeypatch, config, seeded_kg)
|
||||||
from mempalace.mcp_server import handle_request
|
from mempalace.mcp_server import handle_request
|
||||||
@@ -457,6 +484,26 @@ class TestWriteTools:
|
|||||||
assert result2["success"] is True
|
assert result2["success"] is True
|
||||||
assert result2["reason"] == "already_exists"
|
assert result2["reason"] == "already_exists"
|
||||||
|
|
||||||
|
def test_add_drawer_fails_when_readback_misses(self, monkeypatch, config, kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
class _FakeGetResult:
|
||||||
|
ids = []
|
||||||
|
|
||||||
|
class _FakeCol:
|
||||||
|
def get(self, **kwargs):
|
||||||
|
return _FakeGetResult()
|
||||||
|
|
||||||
|
def upsert(self, **kwargs):
|
||||||
|
return None
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_collection", lambda create=False: _FakeCol())
|
||||||
|
|
||||||
|
result = mcp_server.tool_add_drawer("w", "r", "content")
|
||||||
|
assert result["success"] is False
|
||||||
|
assert "not readable" in result["error"]
|
||||||
|
|
||||||
def test_add_drawer_shared_header_no_collision(self, monkeypatch, config, palace_path, kg):
|
def test_add_drawer_shared_header_no_collision(self, monkeypatch, config, palace_path, kg):
|
||||||
"""Documents sharing a >100-char header must get distinct IDs (full-content hash)."""
|
"""Documents sharing a >100-char header must get distinct IDs (full-content hash)."""
|
||||||
_patch_mcp_server(monkeypatch, config, kg)
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
@@ -495,6 +542,41 @@ class TestWriteTools:
|
|||||||
result = tool_delete_drawer("nonexistent_drawer")
|
result = tool_delete_drawer("nonexistent_drawer")
|
||||||
assert result["success"] is False
|
assert result["success"] is False
|
||||||
|
|
||||||
|
def test_check_duplicate_handles_none_metadata(self, monkeypatch, config, kg):
|
||||||
|
"""tool_check_duplicate must tolerate None entries in the result lists
|
||||||
|
that ChromaDB 1.5.x returns for partially-flushed rows.
|
||||||
|
|
||||||
|
Previously ``meta = results["metadatas"][0][i]`` was unguarded and
|
||||||
|
raised ``AttributeError: 'NoneType' object has no attribute 'get'``
|
||||||
|
the moment the first matching drawer came back with None metadata —
|
||||||
|
surfacing to the MCP client as the uninformative
|
||||||
|
``"Duplicate check failed"`` because the broad ``except Exception``
|
||||||
|
wrapper swallows the real cause.
|
||||||
|
"""
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.query.return_value = {
|
||||||
|
"ids": [["d1", "d2"]],
|
||||||
|
"distances": [[0.05, 0.05]],
|
||||||
|
"metadatas": [[{"wing": "w", "room": "r"}, None]],
|
||||||
|
"documents": [["first doc", None]],
|
||||||
|
}
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_collection", lambda: mock_col)
|
||||||
|
|
||||||
|
result = mcp_server.tool_check_duplicate("any content", threshold=0.5)
|
||||||
|
|
||||||
|
# Both entries land in matches (above threshold), None ones rendered
|
||||||
|
# with sentinel values rather than crashing the whole response.
|
||||||
|
assert result.get("is_duplicate") is True
|
||||||
|
assert len(result["matches"]) == 2
|
||||||
|
# The None-metadata entry falls back to sentinels.
|
||||||
|
none_entry = result["matches"][1]
|
||||||
|
assert none_entry["wing"] == "?"
|
||||||
|
assert none_entry["room"] == "?"
|
||||||
|
assert none_entry["content"] == ""
|
||||||
|
|
||||||
def test_check_duplicate(self, monkeypatch, config, palace_path, seeded_collection, kg):
|
def test_check_duplicate(self, monkeypatch, config, palace_path, seeded_collection, kg):
|
||||||
_patch_mcp_server(monkeypatch, config, kg)
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
from mempalace.mcp_server import tool_check_duplicate
|
from mempalace.mcp_server import tool_check_duplicate
|
||||||
@@ -788,6 +870,59 @@ class TestKGTools:
|
|||||||
result = tool_kg_stats()
|
result = tool_kg_stats()
|
||||||
assert result["entities"] >= 4
|
assert result["entities"] >= 4
|
||||||
|
|
||||||
|
# --- Date validation at the MCP boundary (issue #1164) ---
|
||||||
|
|
||||||
|
def test_kg_add_rejects_invalid_valid_from(self, monkeypatch, config, palace_path, kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
from mempalace.mcp_server import tool_kg_add
|
||||||
|
|
||||||
|
result = tool_kg_add(
|
||||||
|
subject="Alice",
|
||||||
|
predicate="likes",
|
||||||
|
object="coffee",
|
||||||
|
valid_from="Jan 2025",
|
||||||
|
)
|
||||||
|
assert result["success"] is False
|
||||||
|
assert "valid_from" in result["error"]
|
||||||
|
assert "ISO-8601" in result["error"]
|
||||||
|
|
||||||
|
def test_kg_query_rejects_invalid_as_of(self, monkeypatch, config, palace_path, seeded_kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, seeded_kg)
|
||||||
|
from mempalace.mcp_server import tool_kg_query
|
||||||
|
|
||||||
|
result = tool_kg_query(entity="Max", as_of="March 2026")
|
||||||
|
assert "error" in result
|
||||||
|
assert "as_of" in result["error"]
|
||||||
|
|
||||||
|
def test_kg_invalidate_rejects_invalid_ended(self, monkeypatch, config, palace_path, seeded_kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, seeded_kg)
|
||||||
|
from mempalace.mcp_server import tool_kg_invalidate
|
||||||
|
|
||||||
|
result = tool_kg_invalidate(
|
||||||
|
subject="Max",
|
||||||
|
predicate="does",
|
||||||
|
object="chess",
|
||||||
|
ended="yesterday",
|
||||||
|
)
|
||||||
|
assert result["success"] is False
|
||||||
|
assert "ended" in result["error"]
|
||||||
|
|
||||||
|
def test_kg_query_rejects_partial_iso_dates(self, monkeypatch, config, palace_path, seeded_kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, seeded_kg)
|
||||||
|
from mempalace.mcp_server import tool_kg_query
|
||||||
|
|
||||||
|
# Partial ISO dates are rejected: KG queries compare TEXT dates
|
||||||
|
# lexicographically, so "2026-01-01" <= "2026" is False, which
|
||||||
|
# silently excludes facts. Reject at the boundary — only YYYY-MM-DD
|
||||||
|
# produces correct results.
|
||||||
|
for value in ("2026", "2026-03"):
|
||||||
|
result = tool_kg_query(entity="Max", as_of=value)
|
||||||
|
assert "error" in result, f"accepted partial date {value!r}: {result}"
|
||||||
|
|
||||||
|
# Full ISO-8601 dates still pass.
|
||||||
|
result = tool_kg_query(entity="Max", as_of="2026-03-15")
|
||||||
|
assert "error" not in result, f"rejected valid date: {result}"
|
||||||
|
|
||||||
|
|
||||||
# ── Diary Tools ─────────────────────────────────────────────────────────
|
# ── Diary Tools ─────────────────────────────────────────────────────────
|
||||||
|
|
||||||
@@ -1043,6 +1178,25 @@ class TestCacheInvalidation:
|
|||||||
assert "Reconnected" in result["message"]
|
assert "Reconnected" in result["message"]
|
||||||
assert isinstance(result["drawers"], int)
|
assert isinstance(result["drawers"], int)
|
||||||
|
|
||||||
|
def test_reconnect_closes_shared_backend(self, monkeypatch, config, kg):
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
|
from mempalace import mcp_server, palace
|
||||||
|
|
||||||
|
close_palace = MagicMock()
|
||||||
|
monkeypatch.setattr(palace._DEFAULT_BACKEND, "close_palace", close_palace)
|
||||||
|
|
||||||
|
class _FakeCol:
|
||||||
|
def count(self):
|
||||||
|
return 7
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_collection", lambda create=False: _FakeCol())
|
||||||
|
|
||||||
|
result = mcp_server.tool_reconnect()
|
||||||
|
assert result["success"] is True
|
||||||
|
close_palace.assert_called_once_with(config.palace_path)
|
||||||
|
|
||||||
def test_get_collection_create_true_avoids_get_or_create_on_reopen(
|
def test_get_collection_create_true_avoids_get_or_create_on_reopen(
|
||||||
self, monkeypatch, config, palace_path, kg
|
self, monkeypatch, config, palace_path, kg
|
||||||
):
|
):
|
||||||
@@ -1143,3 +1297,296 @@ class TestCacheInvalidation:
|
|||||||
for kwargs in captured["get"]:
|
for kwargs in captured["get"]:
|
||||||
assert "embedding_function" in kwargs
|
assert "embedding_function" in kwargs
|
||||||
assert kwargs["embedding_function"] is not None
|
assert kwargs["embedding_function"] is not None
|
||||||
|
|
||||||
|
def test_get_collection_retries_once_on_exception(self, monkeypatch, config, palace_path, kg):
|
||||||
|
"""Regression: a transient failure inside _get_collection must trigger
|
||||||
|
one retry after clearing the client/collection caches, not silently
|
||||||
|
return None.
|
||||||
|
|
||||||
|
Before this fix, a stale chromadb handle (e.g. the rust bindings
|
||||||
|
invalidating after an out-of-band write) would raise inside the
|
||||||
|
single ``try`` block, get swallowed by ``except Exception: return
|
||||||
|
None``, and every subsequent tool call would hit the same poisoned
|
||||||
|
cache returning None. The retry forces ``_get_client()`` to rebuild
|
||||||
|
the client (which re-runs ``quarantine_stale_hnsw`` per #1322), so
|
||||||
|
the second attempt heals the common stale-handle case.
|
||||||
|
"""
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
_client, _col = _get_collection(palace_path, create=True)
|
||||||
|
del _client
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
# Force a cold cache so the first call goes through the open path.
|
||||||
|
mcp_server._client_cache = None
|
||||||
|
mcp_server._collection_cache = None
|
||||||
|
|
||||||
|
real_get_client = mcp_server._get_client
|
||||||
|
attempts = {"count": 0}
|
||||||
|
|
||||||
|
def flaky_get_client():
|
||||||
|
attempts["count"] += 1
|
||||||
|
if attempts["count"] == 1:
|
||||||
|
raise RuntimeError("simulated transient chromadb failure")
|
||||||
|
return real_get_client()
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_client", flaky_get_client)
|
||||||
|
|
||||||
|
col = mcp_server._get_collection()
|
||||||
|
|
||||||
|
# Both attempts ran and the second succeeded.
|
||||||
|
assert attempts["count"] == 2
|
||||||
|
assert col is not None
|
||||||
|
|
||||||
|
def test_get_collection_returns_none_after_two_failures(
|
||||||
|
self, monkeypatch, config, palace_path, kg
|
||||||
|
):
|
||||||
|
"""If both attempts fail, return None (matches the prior contract for
|
||||||
|
permanent failures — only the transient case is now self-healing)."""
|
||||||
|
_patch_mcp_server(monkeypatch, config, kg)
|
||||||
|
_client, _col = _get_collection(palace_path, create=True)
|
||||||
|
del _client
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
mcp_server._client_cache = None
|
||||||
|
mcp_server._collection_cache = None
|
||||||
|
|
||||||
|
attempts = {"count": 0}
|
||||||
|
|
||||||
|
def always_fails():
|
||||||
|
attempts["count"] += 1
|
||||||
|
raise RuntimeError("permanent chromadb failure")
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_client", always_fails)
|
||||||
|
|
||||||
|
col = mcp_server._get_collection()
|
||||||
|
|
||||||
|
assert attempts["count"] == 2
|
||||||
|
assert col is None
|
||||||
|
|
||||||
|
|
||||||
|
class TestKGLazyCache:
|
||||||
|
"""Lazy per-path KnowledgeGraph cache (issue #1136)."""
|
||||||
|
|
||||||
|
def test_lazy_init_no_import_side_effect(self, tmp_path):
|
||||||
|
"""Importing mcp_server must not create knowledge_graph.sqlite3.
|
||||||
|
|
||||||
|
Runs in a fresh subprocess with HOME pointed at tmp_path so the
|
||||||
|
assertion targets a clean filesystem, independent of conftest's
|
||||||
|
session-level HOME patch.
|
||||||
|
"""
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
|
||||||
|
kg_file = tmp_path / ".mempalace" / "knowledge_graph.sqlite3"
|
||||||
|
env = {k: v for k, v in os.environ.items() if not k.startswith("MEMPAL")}
|
||||||
|
env["HOME"] = str(tmp_path)
|
||||||
|
env["USERPROFILE"] = str(tmp_path)
|
||||||
|
result = subprocess.run(
|
||||||
|
[sys.executable, "-c", "import mempalace.mcp_server"],
|
||||||
|
env=env,
|
||||||
|
capture_output=True,
|
||||||
|
text=True,
|
||||||
|
timeout=30,
|
||||||
|
)
|
||||||
|
assert result.returncode == 0, f"import failed: {result.stderr}"
|
||||||
|
assert not kg_file.exists(), f"import created sqlite file at {kg_file} as a side effect"
|
||||||
|
|
||||||
|
def test_get_kg_returns_same_instance(self, tmp_path, monkeypatch):
|
||||||
|
"""Two calls with the same resolved path return the same KG."""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {})
|
||||||
|
monkeypatch.setattr(mcp_server, "_palace_flag_given", True)
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_path))
|
||||||
|
|
||||||
|
kg1 = mcp_server._get_kg()
|
||||||
|
kg2 = mcp_server._get_kg()
|
||||||
|
assert kg1 is kg2
|
||||||
|
assert len(mcp_server._kg_by_path) == 1
|
||||||
|
|
||||||
|
def test_get_kg_different_paths_different_instances(self, tmp_path, monkeypatch):
|
||||||
|
"""Different palace paths map to different KG instances."""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
tmp_a = tmp_path / "a"
|
||||||
|
tmp_b = tmp_path / "b"
|
||||||
|
tmp_a.mkdir()
|
||||||
|
tmp_b.mkdir()
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {})
|
||||||
|
monkeypatch.setattr(mcp_server, "_palace_flag_given", True)
|
||||||
|
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_a))
|
||||||
|
kg_a = mcp_server._get_kg()
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_b))
|
||||||
|
kg_b = mcp_server._get_kg()
|
||||||
|
|
||||||
|
assert kg_a is not kg_b
|
||||||
|
assert len(mcp_server._kg_by_path) == 2
|
||||||
|
|
||||||
|
def test_multi_tenant_env_switch(self, tmp_path, monkeypatch):
|
||||||
|
"""The issue #1136 acceptance scenario.
|
||||||
|
|
||||||
|
Rotating MEMPALACE_PALACE_PATH between MCP tool calls must route
|
||||||
|
each call to the correct tenant's KG sqlite file.
|
||||||
|
"""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
tmp_a = tmp_path / "tenant_a"
|
||||||
|
tmp_b = tmp_path / "tenant_b"
|
||||||
|
tmp_a.mkdir()
|
||||||
|
tmp_b.mkdir()
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {})
|
||||||
|
monkeypatch.setattr(mcp_server, "_palace_flag_given", True)
|
||||||
|
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_a))
|
||||||
|
add_result = mcp_server.tool_kg_add(
|
||||||
|
subject="alice_secret",
|
||||||
|
predicate="owns",
|
||||||
|
object="repo_a",
|
||||||
|
)
|
||||||
|
assert add_result.get("success") is True, add_result
|
||||||
|
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_b))
|
||||||
|
query_b = mcp_server.tool_kg_query(entity="alice_secret")
|
||||||
|
assert query_b.get("count", 0) == 0, f"tenant B leaked tenant A's fact: {query_b}"
|
||||||
|
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_a))
|
||||||
|
query_a = mcp_server.tool_kg_query(entity="alice_secret")
|
||||||
|
assert query_a.get("count", 0) >= 1, f"tenant A lost its own fact: {query_a}"
|
||||||
|
|
||||||
|
def test_cache_thread_safe(self, tmp_path, monkeypatch):
|
||||||
|
"""Concurrent _get_kg() for the same path yields one instance."""
|
||||||
|
import concurrent.futures
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {})
|
||||||
|
monkeypatch.setattr(mcp_server, "_palace_flag_given", True)
|
||||||
|
monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(tmp_path))
|
||||||
|
|
||||||
|
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
|
||||||
|
results = list(pool.map(lambda _: mcp_server._get_kg(), range(16)))
|
||||||
|
|
||||||
|
ids = {id(kg) for kg in results}
|
||||||
|
assert len(ids) == 1, f"expected 1 unique instance, got {len(ids)}"
|
||||||
|
assert len(mcp_server._kg_by_path) == 1
|
||||||
|
|
||||||
|
def test_tool_reconnect_drains_kg_cache(self, monkeypatch):
|
||||||
|
"""``tool_reconnect`` must close cached KG instances and clear the dict.
|
||||||
|
|
||||||
|
Without this, an external replacement of ``knowledge_graph.sqlite3``
|
||||||
|
leaves the server pinned to a stale ``sqlite3.Connection``.
|
||||||
|
"""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
class _FakeKG:
|
||||||
|
def __init__(self):
|
||||||
|
self.closed = False
|
||||||
|
|
||||||
|
def close(self):
|
||||||
|
self.closed = True
|
||||||
|
|
||||||
|
fake_a = _FakeKG()
|
||||||
|
fake_b = _FakeKG()
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {"/a": fake_a, "/b": fake_b})
|
||||||
|
# Bypass real ChromaDB so the test isolates KG-cache behaviour.
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_collection", lambda: None)
|
||||||
|
|
||||||
|
mcp_server.tool_reconnect()
|
||||||
|
|
||||||
|
assert fake_a.closed is True
|
||||||
|
assert fake_b.closed is True
|
||||||
|
assert mcp_server._kg_by_path == {}
|
||||||
|
|
||||||
|
def test_tool_reconnect_swallows_kg_close_errors(self, monkeypatch):
|
||||||
|
"""A failing ``close()`` on one cached KG must not block cache clearing."""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
class _BoomKG:
|
||||||
|
def close(self):
|
||||||
|
raise RuntimeError("boom")
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {"/a": _BoomKG()})
|
||||||
|
monkeypatch.setattr(mcp_server, "_get_collection", lambda: None)
|
||||||
|
|
||||||
|
mcp_server.tool_reconnect()
|
||||||
|
|
||||||
|
assert mcp_server._kg_by_path == {}
|
||||||
|
|
||||||
|
def test_call_kg_retries_after_concurrent_close(self, monkeypatch):
|
||||||
|
"""A KG closed mid-handler must trigger a one-shot retry with a fresh
|
||||||
|
instance — not surface a -32000 to the MCP client."""
|
||||||
|
import sqlite3 as _sqlite3
|
||||||
|
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
path = "/fake/palace/knowledge_graph.sqlite3"
|
||||||
|
monkeypatch.setattr(mcp_server, "_resolve_kg_path", lambda: path)
|
||||||
|
|
||||||
|
class _ClosedKG:
|
||||||
|
def query_entity(self, entity, **kwargs):
|
||||||
|
raise _sqlite3.ProgrammingError("Cannot operate on a closed database")
|
||||||
|
|
||||||
|
class _FreshKG:
|
||||||
|
def query_entity(self, entity, **kwargs):
|
||||||
|
return [{"entity": entity}]
|
||||||
|
|
||||||
|
cache = {os.path.abspath(path): _ClosedKG()}
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", cache)
|
||||||
|
|
||||||
|
# Second _get_kg() call (after the cache eviction) constructs a new
|
||||||
|
# KG. Patch the constructor so we don't open a real sqlite file.
|
||||||
|
monkeypatch.setattr(mcp_server, "KnowledgeGraph", lambda **_: _FreshKG())
|
||||||
|
|
||||||
|
result = mcp_server._call_kg(lambda kg: kg.query_entity("Alice"))
|
||||||
|
assert result == [{"entity": "Alice"}]
|
||||||
|
# The closed instance must be evicted; the fresh one must be cached.
|
||||||
|
assert isinstance(cache[os.path.abspath(path)], _FreshKG)
|
||||||
|
|
||||||
|
def test_call_kg_does_not_retry_on_other_errors(self, monkeypatch):
|
||||||
|
"""Non-ProgrammingError exceptions must propagate without retry —
|
||||||
|
we don't want the retry guard masking real bugs."""
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
path = "/fake/palace/knowledge_graph.sqlite3"
|
||||||
|
monkeypatch.setattr(mcp_server, "_resolve_kg_path", lambda: path)
|
||||||
|
|
||||||
|
calls = {"count": 0}
|
||||||
|
|
||||||
|
class _FailingKG:
|
||||||
|
def query_entity(self, entity, **kwargs):
|
||||||
|
calls["count"] += 1
|
||||||
|
raise ValueError("bad input")
|
||||||
|
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", {os.path.abspath(path): _FailingKG()})
|
||||||
|
monkeypatch.setattr(mcp_server, "KnowledgeGraph", lambda **_: _FailingKG())
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="bad input"):
|
||||||
|
mcp_server._call_kg(lambda kg: kg.query_entity("Alice"))
|
||||||
|
assert calls["count"] == 1, "non-ProgrammingError must not trigger retry"
|
||||||
|
|
||||||
|
def test_call_kg_gives_up_after_one_retry(self, monkeypatch):
|
||||||
|
"""If the second attempt also hits a closed DB, give up rather than
|
||||||
|
loop forever — a sustained close-stream is a different bug."""
|
||||||
|
import sqlite3 as _sqlite3
|
||||||
|
|
||||||
|
from mempalace import mcp_server
|
||||||
|
|
||||||
|
path = "/fake/palace/knowledge_graph.sqlite3"
|
||||||
|
monkeypatch.setattr(mcp_server, "_resolve_kg_path", lambda: path)
|
||||||
|
|
||||||
|
calls = {"count": 0}
|
||||||
|
|
||||||
|
class _AlwaysClosedKG:
|
||||||
|
def query_entity(self, entity, **kwargs):
|
||||||
|
calls["count"] += 1
|
||||||
|
raise _sqlite3.ProgrammingError("closed again")
|
||||||
|
|
||||||
|
cache = {}
|
||||||
|
monkeypatch.setattr(mcp_server, "_kg_by_path", cache)
|
||||||
|
monkeypatch.setattr(mcp_server, "KnowledgeGraph", lambda **_: _AlwaysClosedKG())
|
||||||
|
|
||||||
|
with pytest.raises(_sqlite3.ProgrammingError):
|
||||||
|
mcp_server._call_kg(lambda kg: kg.query_entity("Alice"))
|
||||||
|
assert calls["count"] == 2, "expected exactly one retry beyond the initial attempt"
|
||||||
|
|||||||
+100
-1
@@ -4,7 +4,7 @@ import os
|
|||||||
from types import SimpleNamespace
|
from types import SimpleNamespace
|
||||||
from unittest.mock import MagicMock, patch
|
from unittest.mock import MagicMock, patch
|
||||||
|
|
||||||
from mempalace.migrate import _restore_stale_palace, migrate
|
from mempalace.migrate import collection_write_roundtrip_works, _restore_stale_palace, migrate
|
||||||
|
|
||||||
|
|
||||||
def test_migrate_requires_palace_database(tmp_path, capsys):
|
def test_migrate_requires_palace_database(tmp_path, capsys):
|
||||||
@@ -101,3 +101,102 @@ def test_restore_stale_palace_logs_and_swallows_on_failure(tmp_path, capsys):
|
|||||||
assert "CRITICAL" in out
|
assert "CRITICAL" in out
|
||||||
assert os.fspath(palace_path) in out
|
assert os.fspath(palace_path) in out
|
||||||
assert os.fspath(stale_path) in out
|
assert os.fspath(stale_path) in out
|
||||||
|
|
||||||
|
|
||||||
|
class _FakeGetResult:
|
||||||
|
def __init__(self, ids):
|
||||||
|
self.ids = ids
|
||||||
|
|
||||||
|
|
||||||
|
class _WritableFakeCollection:
|
||||||
|
def __init__(self):
|
||||||
|
self.ids = set()
|
||||||
|
self.deleted = []
|
||||||
|
|
||||||
|
def upsert(self, *, ids, documents, metadatas):
|
||||||
|
self.ids.update(ids)
|
||||||
|
|
||||||
|
def get(self, *, ids, include=None):
|
||||||
|
return _FakeGetResult([drawer_id for drawer_id in ids if drawer_id in self.ids])
|
||||||
|
|
||||||
|
def delete(self, *, ids=None, where=None):
|
||||||
|
for drawer_id in ids or []:
|
||||||
|
self.ids.discard(drawer_id)
|
||||||
|
self.deleted.append(drawer_id)
|
||||||
|
|
||||||
|
|
||||||
|
class _SilentWriteDropCollection(_WritableFakeCollection):
|
||||||
|
def upsert(self, *, ids, documents, metadatas):
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
class _SilentDeleteDropCollection(_WritableFakeCollection):
|
||||||
|
def delete(self, *, ids=None, where=None):
|
||||||
|
self.deleted.extend(ids or [])
|
||||||
|
|
||||||
|
|
||||||
|
def test_collection_write_roundtrip_works_when_probe_persists_and_deletes():
|
||||||
|
col = _WritableFakeCollection()
|
||||||
|
|
||||||
|
assert collection_write_roundtrip_works(col) is True
|
||||||
|
assert col.ids == set()
|
||||||
|
assert len(col.deleted) == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_collection_write_roundtrip_fails_when_upsert_silently_drops():
|
||||||
|
col = _SilentWriteDropCollection()
|
||||||
|
|
||||||
|
assert collection_write_roundtrip_works(col) is False
|
||||||
|
assert col.ids == set()
|
||||||
|
|
||||||
|
|
||||||
|
def test_collection_write_roundtrip_fails_when_delete_silently_drops():
|
||||||
|
col = _SilentDeleteDropCollection()
|
||||||
|
|
||||||
|
assert collection_write_roundtrip_works(col) is False
|
||||||
|
assert len(col.ids) == 1
|
||||||
|
|
||||||
|
|
||||||
|
def test_migrate_dry_run_rebuilds_when_collection_is_readable_but_not_writable(tmp_path, capsys):
|
||||||
|
palace_dir = tmp_path / "palace"
|
||||||
|
palace_dir.mkdir()
|
||||||
|
(palace_dir / "chroma.sqlite3").write_text("db")
|
||||||
|
|
||||||
|
fake_col = MagicMock()
|
||||||
|
fake_col.count.return_value = 102
|
||||||
|
|
||||||
|
drawers = [
|
||||||
|
{
|
||||||
|
"id": "id1",
|
||||||
|
"document": "hello",
|
||||||
|
"metadata": {"wing": "test-wing", "room": "general"},
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.migrate.detect_chromadb_version", return_value="1.x"),
|
||||||
|
patch("mempalace.backends.chroma.ChromaBackend") as mock_backend,
|
||||||
|
patch(
|
||||||
|
"mempalace.migrate.collection_write_roundtrip_works", return_value=False
|
||||||
|
) as mock_probe,
|
||||||
|
patch(
|
||||||
|
"mempalace.migrate.extract_drawers_from_sqlite", return_value=drawers
|
||||||
|
) as mock_extract,
|
||||||
|
):
|
||||||
|
mock_backend.backend_version.return_value = "1.5.8"
|
||||||
|
mock_backend.return_value.get_collection.return_value = fake_col
|
||||||
|
|
||||||
|
result = migrate(str(palace_dir), dry_run=True)
|
||||||
|
|
||||||
|
out = capsys.readouterr().out
|
||||||
|
|
||||||
|
assert result is True
|
||||||
|
mock_probe.assert_called_once_with(fake_col)
|
||||||
|
mock_extract.assert_called_once_with(
|
||||||
|
os.path.join(os.path.abspath(os.fspath(palace_dir)), "chroma.sqlite3")
|
||||||
|
)
|
||||||
|
|
||||||
|
assert "readable by chromadb 1.5.8, but write/delete verification failed" in out
|
||||||
|
assert "Rebuilding from SQLite" in out
|
||||||
|
assert "Extracted 1 drawers from SQLite" in out
|
||||||
|
assert "DRY RUN" in out
|
||||||
|
|||||||
@@ -699,6 +699,83 @@ def test_mine_keyboard_interrupt_quotes_path_with_spaces_in_resume_hint(tmp_path
|
|||||||
assert f"mempalace mine {shlex.quote(str(project_root))}" in out
|
assert f"mempalace mine {shlex.quote(str(project_root))}" in out
|
||||||
|
|
||||||
|
|
||||||
|
def test_skip_filenames_includes_lockfiles():
|
||||||
|
"""pnpm-lock.yaml and yarn.lock must be skipped alongside package-lock.json
|
||||||
|
so a Windows mine over a typical JS monorepo doesn't OOM the ONNX embedder
|
||||||
|
on a 24K-line lockfile (#1296)."""
|
||||||
|
from mempalace import miner
|
||||||
|
|
||||||
|
assert "package-lock.json" in miner.SKIP_FILENAMES
|
||||||
|
assert "pnpm-lock.yaml" in miner.SKIP_FILENAMES
|
||||||
|
assert "yarn.lock" in miner.SKIP_FILENAMES
|
||||||
|
|
||||||
|
|
||||||
|
def test_process_file_skips_when_chunks_exceed_max(tmp_path, monkeypatch):
|
||||||
|
"""A file producing more than MAX_CHUNKS_PER_FILE chunks must be skipped
|
||||||
|
with a clear message and zero upserts. Generated artifacts (CSVs, lock
|
||||||
|
files not in SKIP_FILENAMES) hit this — the cap is what prevents ONNX
|
||||||
|
bad_alloc on Windows when the embedder is asked to swallow thousands of
|
||||||
|
chunks in one batch (#1296)."""
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
|
from mempalace import miner
|
||||||
|
|
||||||
|
monkeypatch.setattr(miner, "MAX_CHUNKS_PER_FILE", 5)
|
||||||
|
over_cap = [{"content": f"chunk {i}", "chunk_index": i} for i in range(7)]
|
||||||
|
monkeypatch.setattr(miner, "chunk_text", lambda content, source_file: over_cap)
|
||||||
|
|
||||||
|
source = tmp_path / "huge.csv"
|
||||||
|
source.write_text("col1,col2\n" + "x,y\n" * 500, encoding="utf-8")
|
||||||
|
col = MagicMock()
|
||||||
|
col.get.return_value = {"ids": []}
|
||||||
|
|
||||||
|
drawers, room = miner.process_file(
|
||||||
|
source,
|
||||||
|
tmp_path,
|
||||||
|
col,
|
||||||
|
"wing",
|
||||||
|
[{"name": "general", "description": "General"}],
|
||||||
|
"agent",
|
||||||
|
False,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert drawers == 0
|
||||||
|
col.upsert.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
def test_mine_arbitrary_exception_prints_summary_and_reraises(tmp_path, capsys):
|
||||||
|
"""A non-KeyboardInterrupt exception mid-mine must surface a summary
|
||||||
|
banner before propagating, so users don't see a silent exit-0 with no
|
||||||
|
completion message (#1296 Failure 2). Re-raise preserves the traceback
|
||||||
|
and yields a non-zero exit code."""
|
||||||
|
import pytest
|
||||||
|
from unittest.mock import patch
|
||||||
|
|
||||||
|
project_root = tmp_path / "proj"
|
||||||
|
project_root.mkdir()
|
||||||
|
_make_minable_project(project_root, n_files=4)
|
||||||
|
palace_path = project_root / "palace"
|
||||||
|
|
||||||
|
call_count = {"n": 0}
|
||||||
|
|
||||||
|
def fake_process_file(*args, **kwargs):
|
||||||
|
call_count["n"] += 1
|
||||||
|
if call_count["n"] == 2:
|
||||||
|
raise RuntimeError("simulated ONNX bad_alloc")
|
||||||
|
return (1, "general")
|
||||||
|
|
||||||
|
with patch("mempalace.miner.process_file", side_effect=fake_process_file):
|
||||||
|
with pytest.raises(RuntimeError, match="simulated ONNX bad_alloc"):
|
||||||
|
mine(str(project_root), str(palace_path))
|
||||||
|
|
||||||
|
out = capsys.readouterr().out
|
||||||
|
assert "Mine aborted by exception." in out
|
||||||
|
assert "files_processed: 1/" in out
|
||||||
|
assert "drawers_filed:" in out
|
||||||
|
assert "RuntimeError: simulated ONNX bad_alloc" in out
|
||||||
|
assert "upserted idempotently" in out
|
||||||
|
|
||||||
|
|
||||||
def test_mine_cleans_up_pid_file_on_interrupt(tmp_path):
|
def test_mine_cleans_up_pid_file_on_interrupt(tmp_path):
|
||||||
"""Our own PID entry in mine.pid is removed in the finally clause."""
|
"""Our own PID entry in mine.pid is removed in the finally clause."""
|
||||||
import pytest
|
import pytest
|
||||||
|
|||||||
@@ -135,19 +135,77 @@ def test_different_palaces_dont_conflict(tmp_path, monkeypatch):
|
|||||||
|
|
||||||
|
|
||||||
def test_palace_path_is_normalized(tmp_path, monkeypatch):
|
def test_palace_path_is_normalized(tmp_path, monkeypatch):
|
||||||
"""Relative and absolute forms of the same path must use the same lock."""
|
"""Relative and absolute forms of the same path must use the same lock.
|
||||||
|
|
||||||
|
Cross-process variant: a child holds the absolute form, a relative form
|
||||||
|
in the parent must hash to the same lock key and raise
|
||||||
|
``MineAlreadyRunning``. (The same-thread case is now a re-entrant
|
||||||
|
pass-through by design — see ``test_reentrant_same_thread_passes_through``
|
||||||
|
— so we exercise the normalization invariant across a process boundary
|
||||||
|
where re-entrance does not apply.)
|
||||||
|
"""
|
||||||
monkeypatch.setenv("HOME", str(tmp_path))
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
monkeypatch.chdir(tmp_path)
|
monkeypatch.chdir(tmp_path)
|
||||||
os.makedirs(tmp_path / "palace", exist_ok=True)
|
os.makedirs(tmp_path / "palace", exist_ok=True)
|
||||||
absolute = str(tmp_path / "palace")
|
absolute = str(tmp_path / "palace")
|
||||||
relative = "palace"
|
ready = str(tmp_path / "ready")
|
||||||
|
release = str(tmp_path / "release")
|
||||||
|
|
||||||
# Hold the lock with the absolute form; attempting to re-acquire with
|
ctx = _get_mp_context()
|
||||||
# the relative form (which resolves to the same absolute path) must fail.
|
holder = ctx.Process(target=_hold_lock, args=(absolute, ready, release))
|
||||||
with mine_palace_lock(absolute):
|
holder.start()
|
||||||
|
try:
|
||||||
|
for _ in range(500):
|
||||||
|
if os.path.exists(ready):
|
||||||
|
break
|
||||||
|
time.sleep(0.01)
|
||||||
|
assert os.path.exists(ready), "holder failed to acquire lock in time"
|
||||||
|
|
||||||
|
# Parent holds CWD = tmp_path so "palace" is the same on-disk dir as
|
||||||
|
# the absolute form. The lock key is sha256(realpath+normcase) so the
|
||||||
|
# two forms must collide.
|
||||||
with pytest.raises(MineAlreadyRunning):
|
with pytest.raises(MineAlreadyRunning):
|
||||||
with mine_palace_lock(relative):
|
with mine_palace_lock("palace"):
|
||||||
pytest.fail("normalized path collision should have raised")
|
pytest.fail("normalized path collision should have raised")
|
||||||
|
finally:
|
||||||
|
open(release, "w").close()
|
||||||
|
holder.join(timeout=5)
|
||||||
|
|
||||||
|
|
||||||
|
def test_reentrant_same_thread_passes_through(tmp_path, monkeypatch):
|
||||||
|
"""Same thread re-acquiring the same palace lock must not deadlock or raise.
|
||||||
|
|
||||||
|
This is the invariant that makes ``ChromaCollection`` write methods (which
|
||||||
|
take ``mine_palace_lock`` for MCP/direct-writer protection) compose with
|
||||||
|
``miner.mine()`` (which already holds the lock for the entire mine
|
||||||
|
pipeline). Without the per-thread re-entrant guard the inner acquire
|
||||||
|
would self-deadlock on the outer flock.
|
||||||
|
"""
|
||||||
|
monkeypatch.setenv("HOME", str(tmp_path))
|
||||||
|
palace = str(tmp_path / "palace")
|
||||||
|
with mine_palace_lock(palace):
|
||||||
|
# Re-enter from the same thread — must yield without raising or hanging.
|
||||||
|
with mine_palace_lock(palace):
|
||||||
|
pass
|
||||||
|
# After the inner exits, the outer is still held: confirm via a
|
||||||
|
# subprocess that tries to acquire and reports back.
|
||||||
|
ctx = _get_mp_context()
|
||||||
|
result_q = ctx.Queue()
|
||||||
|
child = ctx.Process(target=_try_acquire_expect_busy, args=(palace, result_q))
|
||||||
|
child.start()
|
||||||
|
child.join(timeout=5)
|
||||||
|
assert (
|
||||||
|
result_q.get(timeout=1) == "busy"
|
||||||
|
), "outer lock should still be held by parent after inner re-entrant exit"
|
||||||
|
|
||||||
|
|
||||||
|
def _try_acquire_expect_busy(palace_path, result_q):
|
||||||
|
"""Helper: try to acquire, push 'busy' (raised) or 'free' (acquired) into queue."""
|
||||||
|
try:
|
||||||
|
with mine_palace_lock(palace_path):
|
||||||
|
result_q.put("free")
|
||||||
|
except MineAlreadyRunning:
|
||||||
|
result_q.put("busy")
|
||||||
|
|
||||||
|
|
||||||
def test_mine_global_lock_is_alias_for_back_compat(tmp_path, monkeypatch):
|
def test_mine_global_lock_is_alias_for_back_compat(tmp_path, monkeypatch):
|
||||||
|
|||||||
+790
-9
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
import os
|
import os
|
||||||
import sqlite3
|
import sqlite3
|
||||||
from unittest.mock import MagicMock, patch
|
from unittest.mock import MagicMock, call, patch
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
@@ -28,6 +28,16 @@ def test_get_palace_path_fallback():
|
|||||||
assert ".mempalace" in result
|
assert ".mempalace" in result
|
||||||
|
|
||||||
|
|
||||||
|
def test_get_collection_name_from_config():
|
||||||
|
from mempalace.config import get_configured_collection_name
|
||||||
|
|
||||||
|
get_configured_collection_name.cache_clear()
|
||||||
|
with patch("mempalace.config.MempalaceConfig") as mock_config_cls:
|
||||||
|
mock_config_cls.return_value.collection_name = "custom_drawers"
|
||||||
|
assert repair._drawers_collection_name() == "custom_drawers"
|
||||||
|
get_configured_collection_name.cache_clear()
|
||||||
|
|
||||||
|
|
||||||
# ── _paginate_ids ─────────────────────────────────────────────────────
|
# ── _paginate_ids ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
@@ -231,8 +241,11 @@ def test_rebuild_index_success(mock_backend_cls, mock_shutil, tmp_path):
|
|||||||
}
|
}
|
||||||
|
|
||||||
mock_new_col = MagicMock()
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
mock_backend.create_collection.return_value = mock_new_col
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
|
||||||
repair.rebuild_index(palace_path=str(tmp_path))
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
@@ -240,15 +253,75 @@ def test_rebuild_index_success(mock_backend_cls, mock_shutil, tmp_path):
|
|||||||
mock_shutil.copy2.assert_called_once()
|
mock_shutil.copy2.assert_called_once()
|
||||||
assert "chroma.sqlite3" in str(mock_shutil.copy2.call_args)
|
assert "chroma.sqlite3" in str(mock_shutil.copy2.call_args)
|
||||||
|
|
||||||
# Verify: deleted and recreated.
|
# Verify: deleted and recreated (cosine is the backend default)
|
||||||
mock_backend.delete_collection.assert_called_once_with(str(tmp_path), "mempalace_drawers")
|
assert mock_backend.create_collection.call_args_list == [
|
||||||
mock_backend.create_collection.assert_called_once_with(str(tmp_path), "mempalace_drawers")
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
]
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
# Verify: used upsert, not add.
|
# Verify: used upsert not add
|
||||||
|
mock_temp_col.upsert.assert_called_once()
|
||||||
mock_new_col.upsert.assert_called_once()
|
mock_new_col.upsert.assert_called_once()
|
||||||
mock_new_col.add.assert_not_called()
|
mock_new_col.add.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_ignores_missing_temp_collection_at_start(
|
||||||
|
mock_backend_cls, mock_shutil, tmp_path
|
||||||
|
):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
def _fake_copy2(src, dst):
|
||||||
|
with open(dst, "w") as handle:
|
||||||
|
handle.write("backup")
|
||||||
|
|
||||||
|
mock_shutil.copy2.side_effect = _fake_copy2
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
mock_backend.delete_collection.side_effect = [
|
||||||
|
ValueError("Collection [mempalace_drawers__repair_tmp] does not exist"),
|
||||||
|
None,
|
||||||
|
None,
|
||||||
|
]
|
||||||
|
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert mock_shutil.copy2.call_count == 1
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_delete_collection_if_exists_reraises_unexpected_value_error():
|
||||||
|
mock_backend = MagicMock()
|
||||||
|
mock_backend.delete_collection.side_effect = ValueError("invalid collection name")
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="invalid collection name"):
|
||||||
|
repair._delete_collection_if_exists(mock_backend, "/palace", "bad/name")
|
||||||
|
|
||||||
|
|
||||||
@patch("mempalace.repair.shutil")
|
@patch("mempalace.repair.shutil")
|
||||||
@patch("mempalace.repair.ChromaBackend")
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
def test_rebuild_index_error_reading(mock_backend_cls, mock_shutil, tmp_path):
|
def test_rebuild_index_error_reading(mock_backend_cls, mock_shutil, tmp_path):
|
||||||
@@ -269,6 +342,21 @@ def test_check_extraction_safety_passes_when_counts_match(tmp_path):
|
|||||||
repair.check_extraction_safety(str(tmp_path), 500)
|
repair.check_extraction_safety(str(tmp_path), 500)
|
||||||
|
|
||||||
|
|
||||||
|
def test_check_extraction_safety_uses_configured_collection(tmp_path):
|
||||||
|
with patch("mempalace.repair.sqlite_drawer_count", return_value=500) as count:
|
||||||
|
repair.check_extraction_safety(str(tmp_path), 500, collection_name="custom_drawers")
|
||||||
|
count.assert_called_once_with(str(tmp_path), "custom_drawers")
|
||||||
|
|
||||||
|
|
||||||
|
def test_check_extraction_safety_default_uses_configured_collection(tmp_path):
|
||||||
|
with (
|
||||||
|
patch("mempalace.repair._drawers_collection_name", return_value="custom_drawers"),
|
||||||
|
patch("mempalace.repair.sqlite_drawer_count", return_value=500) as count,
|
||||||
|
):
|
||||||
|
repair.check_extraction_safety(str(tmp_path), 500)
|
||||||
|
count.assert_called_once_with(str(tmp_path), "custom_drawers")
|
||||||
|
|
||||||
|
|
||||||
def test_check_extraction_safety_passes_when_sqlite_unreadable_and_under_cap(tmp_path):
|
def test_check_extraction_safety_passes_when_sqlite_unreadable_and_under_cap(tmp_path):
|
||||||
"""SQLite check fails (None) but extraction is well under the cap → safe."""
|
"""SQLite check fails (None) but extraction is well under the cap → safe."""
|
||||||
with patch("mempalace.repair.sqlite_drawer_count", return_value=None):
|
with patch("mempalace.repair.sqlite_drawer_count", return_value=None):
|
||||||
@@ -323,6 +411,73 @@ def test_sqlite_drawer_count_returns_none_on_unreadable_schema(tmp_path):
|
|||||||
assert repair.sqlite_drawer_count(str(tmp_path)) is None
|
assert repair.sqlite_drawer_count(str(tmp_path)) is None
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_default_uses_configured_collection(mock_backend_cls, mock_shutil, tmp_path):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.repair._drawers_collection_name", return_value="custom_drawers"),
|
||||||
|
patch("mempalace.repair.sqlite_drawer_count", return_value=2) as count,
|
||||||
|
):
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
mock_backend.get_collection.assert_called_once_with(str(tmp_path), "custom_drawers")
|
||||||
|
count.assert_called_once_with(str(tmp_path), "custom_drawers")
|
||||||
|
assert mock_backend.create_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "custom_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "custom_drawers"),
|
||||||
|
]
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "custom_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "custom_drawers"),
|
||||||
|
call(str(tmp_path), "custom_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_status_default_uses_configured_drawer_collection(tmp_path):
|
||||||
|
with (
|
||||||
|
patch("mempalace.repair._drawers_collection_name", return_value="custom_drawers"),
|
||||||
|
patch("mempalace.repair.hnsw_capacity_status") as capacity_status,
|
||||||
|
):
|
||||||
|
capacity_status.side_effect = [
|
||||||
|
{
|
||||||
|
"sqlite_count": 1,
|
||||||
|
"hnsw_count": 1,
|
||||||
|
"divergence": 0,
|
||||||
|
"diverged": False,
|
||||||
|
"status": "ok",
|
||||||
|
"message": "",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"sqlite_count": 0,
|
||||||
|
"hnsw_count": 0,
|
||||||
|
"divergence": 0,
|
||||||
|
"diverged": False,
|
||||||
|
"status": "ok",
|
||||||
|
"message": "",
|
||||||
|
},
|
||||||
|
]
|
||||||
|
repair.status(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert capacity_status.call_args_list[0].args == (str(tmp_path), "custom_drawers")
|
||||||
|
assert capacity_status.call_args_list[1].args == (str(tmp_path), "mempalace_closets")
|
||||||
|
|
||||||
|
|
||||||
@patch("mempalace.repair.shutil")
|
@patch("mempalace.repair.shutil")
|
||||||
@patch("mempalace.repair.ChromaBackend")
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
def test_rebuild_index_aborts_on_truncation_signal(mock_backend_cls, mock_shutil, tmp_path):
|
def test_rebuild_index_aborts_on_truncation_signal(mock_backend_cls, mock_shutil, tmp_path):
|
||||||
@@ -367,19 +522,261 @@ def test_rebuild_index_proceeds_with_override(mock_backend_cls, mock_shutil, tmp
|
|||||||
},
|
},
|
||||||
{"ids": [], "documents": [], "metadatas": []},
|
{"ids": [], "documents": [], "metadatas": []},
|
||||||
]
|
]
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 10_000
|
||||||
mock_new_col = MagicMock()
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 10_000
|
||||||
mock_backend.get_collection.return_value = mock_col
|
mock_backend.get_collection.return_value = mock_col
|
||||||
mock_backend.create_collection.return_value = mock_new_col
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
mock_backend_cls.return_value = mock_backend
|
mock_backend_cls.return_value = mock_backend
|
||||||
|
|
||||||
with patch("mempalace.repair.sqlite_drawer_count", return_value=67_580):
|
with patch("mempalace.repair.sqlite_drawer_count", return_value=67_580):
|
||||||
repair.rebuild_index(palace_path=str(tmp_path), confirm_truncation_ok=True)
|
repair.rebuild_index(palace_path=str(tmp_path), confirm_truncation_ok=True)
|
||||||
|
|
||||||
mock_backend.delete_collection.assert_called_once()
|
assert mock_backend.delete_collection.call_count == 3
|
||||||
mock_backend.create_collection.assert_called_once()
|
assert mock_backend.create_collection.call_count == 2
|
||||||
|
mock_temp_col.upsert.assert_called()
|
||||||
mock_new_col.upsert.assert_called()
|
mock_new_col.upsert.assert_called()
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_stage_failure_leaves_live_collection_untouched(
|
||||||
|
mock_backend_cls, mock_shutil, tmp_path
|
||||||
|
):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 1
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.return_value = mock_temp_col
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildCollectionError) as excinfo:
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert excinfo.value.live_replaced is False
|
||||||
|
assert mock_shutil.copy2.call_count == 1
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_live_failure_restores_backup(mock_backend_cls, mock_shutil, tmp_path):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
def _fake_copy2(src, dst):
|
||||||
|
with open(dst, "w") as handle:
|
||||||
|
handle.write("backup")
|
||||||
|
|
||||||
|
mock_shutil.copy2.side_effect = _fake_copy2
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.upsert.side_effect = RuntimeError("live upsert failed")
|
||||||
|
active_backend = MagicMock()
|
||||||
|
active_backend.get_collection.return_value = mock_col
|
||||||
|
active_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
helper_backend = MagicMock()
|
||||||
|
mock_backend_cls.side_effect = [active_backend, helper_backend]
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildCollectionError) as excinfo:
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert excinfo.value.live_replaced is True
|
||||||
|
assert mock_shutil.copy2.call_count == 2
|
||||||
|
assert active_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
]
|
||||||
|
active_backend.close_palace.assert_called_once_with(str(tmp_path))
|
||||||
|
helper_backend.close_palace.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_live_delete_missing_still_restores_backup(
|
||||||
|
mock_backend_cls, mock_shutil, tmp_path
|
||||||
|
):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
def _fake_copy2(src, dst):
|
||||||
|
with open(dst, "w") as handle:
|
||||||
|
handle.write("backup")
|
||||||
|
|
||||||
|
mock_shutil.copy2.side_effect = _fake_copy2
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, RuntimeError("create failed")]
|
||||||
|
mock_backend.delete_collection.side_effect = [
|
||||||
|
None,
|
||||||
|
None,
|
||||||
|
None,
|
||||||
|
repair.ChromaNotFoundError("missing"),
|
||||||
|
]
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildCollectionError) as excinfo:
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert excinfo.value.live_replaced is True
|
||||||
|
assert mock_shutil.copy2.call_count == 2
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_restore_failure_preserves_original_error(
|
||||||
|
mock_backend_cls, mock_shutil, tmp_path, capsys
|
||||||
|
):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
def _copy2_side_effect(src, dst):
|
||||||
|
if str(src).endswith(".backup"):
|
||||||
|
raise PermissionError("locked sqlite")
|
||||||
|
with open(dst, "w") as handle:
|
||||||
|
handle.write("backup")
|
||||||
|
|
||||||
|
mock_shutil.copy2.side_effect = _copy2_side_effect
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.upsert.side_effect = RuntimeError("live upsert failed")
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildCollectionError) as excinfo:
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
out = capsys.readouterr().out
|
||||||
|
assert "locked sqlite" in out
|
||||||
|
assert "Manual restore required" in out
|
||||||
|
assert "live upsert failed" in str(excinfo.value)
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_collection_via_temp_keeps_original_error_when_cleanup_fails(
|
||||||
|
mock_backend_cls,
|
||||||
|
):
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, RuntimeError("live build failed")]
|
||||||
|
mock_backend.delete_collection.side_effect = [
|
||||||
|
None,
|
||||||
|
None,
|
||||||
|
RuntimeError("cleanup failed"),
|
||||||
|
]
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildCollectionError) as excinfo:
|
||||||
|
repair._rebuild_collection_via_temp(
|
||||||
|
mock_backend,
|
||||||
|
"/palace",
|
||||||
|
["id1", "id2"],
|
||||||
|
["doc1", "doc2"],
|
||||||
|
[{"wing": "a"}, {"wing": "b"}],
|
||||||
|
batch_size=5000,
|
||||||
|
progress=lambda *args, **kwargs: None,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert "live build failed" in str(excinfo.value)
|
||||||
|
assert excinfo.value.live_replaced is True
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call("/palace", "mempalace_drawers__repair_tmp"),
|
||||||
|
call("/palace", "mempalace_drawers"),
|
||||||
|
call("/palace", "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
@patch("mempalace.repair.shutil")
|
||||||
|
@patch("mempalace.repair.ChromaBackend")
|
||||||
|
def test_rebuild_index_ignores_temp_cleanup_failure_after_success(
|
||||||
|
mock_backend_cls, mock_shutil, tmp_path
|
||||||
|
):
|
||||||
|
sqlite_path = tmp_path / "chroma.sqlite3"
|
||||||
|
sqlite3.connect(str(sqlite_path)).close()
|
||||||
|
|
||||||
|
def _fake_copy2(src, dst):
|
||||||
|
with open(dst, "w") as handle:
|
||||||
|
handle.write("backup")
|
||||||
|
|
||||||
|
mock_shutil.copy2.side_effect = _fake_copy2
|
||||||
|
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.count.return_value = 2
|
||||||
|
mock_col.get.return_value = {
|
||||||
|
"ids": ["id1", "id2"],
|
||||||
|
"documents": ["doc1", "doc2"],
|
||||||
|
"metadatas": [{"wing": "a"}, {"wing": "b"}],
|
||||||
|
}
|
||||||
|
mock_temp_col = MagicMock()
|
||||||
|
mock_temp_col.count.return_value = 2
|
||||||
|
mock_new_col = MagicMock()
|
||||||
|
mock_new_col.count.return_value = 2
|
||||||
|
mock_backend = _install_mock_backend(mock_backend_cls, mock_col)
|
||||||
|
mock_backend.create_collection.side_effect = [mock_temp_col, mock_new_col]
|
||||||
|
mock_backend.delete_collection.side_effect = [
|
||||||
|
None,
|
||||||
|
None,
|
||||||
|
RuntimeError("cleanup failed"),
|
||||||
|
]
|
||||||
|
|
||||||
|
repair.rebuild_index(palace_path=str(tmp_path))
|
||||||
|
|
||||||
|
assert mock_shutil.copy2.call_count == 1
|
||||||
|
assert mock_backend.delete_collection.call_args_list == [
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers"),
|
||||||
|
call(str(tmp_path), "mempalace_drawers__repair_tmp"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
# ── repair_max_seq_id ─────────────────────────────────────────────────
|
# ── repair_max_seq_id ─────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
@@ -754,3 +1151,387 @@ def test_rebuild_index_aborts_on_sqlite_integrity_errors_before_delete_collectio
|
|||||||
mock_backend.delete_collection.assert_not_called()
|
mock_backend.delete_collection.assert_not_called()
|
||||||
mock_backend.create_collection.assert_not_called()
|
mock_backend.create_collection.assert_not_called()
|
||||||
mock_shutil.copy2.assert_not_called()
|
mock_shutil.copy2.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
# ── extract_via_sqlite + rebuild_from_sqlite (#1308) ──────────────────
|
||||||
|
#
|
||||||
|
# These tests build real chromadb palaces in tmp_path rather than mocking
|
||||||
|
# the SQLite layer. The bug class they guard against is "extraction sees
|
||||||
|
# different rows than chromadb stored" — the only honest check is to let
|
||||||
|
# chromadb actually write rows and then read them back via the SQLite
|
||||||
|
# bypass. Mocking the SQLite cursor would defeat the test.
|
||||||
|
|
||||||
|
|
||||||
|
def _seed_palace(palace_path, collection_name, rows):
|
||||||
|
"""Build a real chromadb palace at ``palace_path`` and add ``rows``.
|
||||||
|
|
||||||
|
``rows`` is a list of ``(id, document, metadata)`` tuples.
|
||||||
|
"""
|
||||||
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
try:
|
||||||
|
col = backend.create_collection(str(palace_path), collection_name)
|
||||||
|
col.upsert(
|
||||||
|
ids=[r[0] for r in rows],
|
||||||
|
documents=[r[1] for r in rows],
|
||||||
|
metadatas=[r[2] for r in rows],
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
# Release chromadb's rust-side SQLite/HNSW file locks before the
|
||||||
|
# caller proceeds. Without this, an in-place rebuild on Windows
|
||||||
|
# fails with WinError 32 on data_level0.bin during the archive
|
||||||
|
# rename (cf. PR #1310 test-windows job).
|
||||||
|
backend.close()
|
||||||
|
|
||||||
|
|
||||||
|
def test_extract_via_sqlite_returns_all_rows_with_metadata(tmp_path):
|
||||||
|
"""Round-trip: a chromadb palace with N upserted rows returns those
|
||||||
|
same N rows when read via the SQLite bypass.
|
||||||
|
|
||||||
|
Catches: anyone who breaks the segments/embeddings/embedding_metadata
|
||||||
|
JOIN, swaps the metadata vs vector segment, or changes how the
|
||||||
|
document is stored under the ``chroma:document`` key.
|
||||||
|
|
||||||
|
Also asserts every embedding row underlying the extraction lives in
|
||||||
|
a ``segments.scope = 'METADATA'`` segment. Document + metadata rows
|
||||||
|
are stored under METADATA in Chroma's segment layout while HNSW
|
||||||
|
files live under ``VECTOR``; locking that assumption in here means a
|
||||||
|
future refactor that accidentally points the JOIN at ``VECTOR``
|
||||||
|
fails this test instead of silently regressing the recovery path.
|
||||||
|
"""
|
||||||
|
rows = [
|
||||||
|
(f"drawer_{i:03d}", f"document body {i}", {"wing": "test_wing", "room": f"r{i % 3}"})
|
||||||
|
for i in range(25)
|
||||||
|
]
|
||||||
|
_seed_palace(tmp_path, "mempalace_drawers", rows)
|
||||||
|
|
||||||
|
extracted = list(repair.extract_via_sqlite(str(tmp_path), "mempalace_drawers"))
|
||||||
|
|
||||||
|
assert len(extracted) == 25
|
||||||
|
by_id = {emb_id: (doc, meta) for emb_id, doc, meta in extracted}
|
||||||
|
assert set(by_id) == {r[0] for r in rows}
|
||||||
|
for emb_id, doc, meta in rows:
|
||||||
|
got_doc, got_meta = by_id[emb_id]
|
||||||
|
assert got_doc == doc, f"document mangled for {emb_id}"
|
||||||
|
assert got_meta == meta, f"metadata mangled for {emb_id}: {got_meta!r}"
|
||||||
|
|
||||||
|
# Lock the segment-scope assumption directly against Chroma's on-disk
|
||||||
|
# layout so a future change that points the extraction JOIN at the
|
||||||
|
# VECTOR segment cannot pass this test. Query each extracted row's
|
||||||
|
# backing segment scope via the same SQLite tables ``extract_via_sqlite``
|
||||||
|
# reads from.
|
||||||
|
sqlite_path = os.path.join(str(tmp_path), "chroma.sqlite3")
|
||||||
|
conn = sqlite3.connect(f"file:{sqlite_path}?mode=ro", uri=True)
|
||||||
|
try:
|
||||||
|
scopes = {
|
||||||
|
scope
|
||||||
|
for (scope,) in conn.execute(
|
||||||
|
"""
|
||||||
|
SELECT DISTINCT s.scope
|
||||||
|
FROM embeddings e
|
||||||
|
JOIN segments s ON e.segment_id = s.id
|
||||||
|
JOIN collections c ON s.collection = c.id
|
||||||
|
WHERE c.name = ? AND e.embedding_id IN ({})
|
||||||
|
""".format(",".join("?" * len(extracted))),
|
||||||
|
("mempalace_drawers", *(emb_id for emb_id, _, _ in extracted)),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
assert scopes == {"METADATA"}, (
|
||||||
|
f"extraction is reading from segments scoped {scopes!r}; only "
|
||||||
|
"'METADATA' should back the document/metadata rows. If Chroma's "
|
||||||
|
"segment layout changed, update extract_via_sqlite's WHERE clause."
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_extract_via_sqlite_preserves_typed_metadata(tmp_path):
|
||||||
|
"""Chromadb stores int / float / bool / string in distinct typed
|
||||||
|
columns. Extraction must round-trip the original type, not coerce
|
||||||
|
everything to string.
|
||||||
|
|
||||||
|
Catches: a regression where the SELECT order changes and ints come
|
||||||
|
back as None, or where the column-resolution rule prefers the wrong
|
||||||
|
column.
|
||||||
|
"""
|
||||||
|
rows = [
|
||||||
|
(
|
||||||
|
"drawer_typed",
|
||||||
|
"doc",
|
||||||
|
{
|
||||||
|
"wing": "w",
|
||||||
|
"chunk_index": 7, # int
|
||||||
|
"score": 0.42, # float
|
||||||
|
"is_active": True, # bool
|
||||||
|
},
|
||||||
|
),
|
||||||
|
]
|
||||||
|
_seed_palace(tmp_path, "mempalace_drawers", rows)
|
||||||
|
|
||||||
|
extracted = list(repair.extract_via_sqlite(str(tmp_path), "mempalace_drawers"))
|
||||||
|
assert len(extracted) == 1
|
||||||
|
_, _, meta = extracted[0]
|
||||||
|
|
||||||
|
assert meta["chunk_index"] == 7 and isinstance(meta["chunk_index"], int)
|
||||||
|
assert meta["score"] == 0.42 and isinstance(meta["score"], float)
|
||||||
|
assert meta["is_active"] is True
|
||||||
|
assert meta["wing"] == "w"
|
||||||
|
|
||||||
|
|
||||||
|
def test_extract_via_sqlite_unknown_collection_yields_nothing(tmp_path):
|
||||||
|
"""Asking for a collection that isn't in the palace must return an
|
||||||
|
empty iterator, not silently fall back to another collection's
|
||||||
|
metadata segment. Seeds two real collections and queries for a third
|
||||||
|
name so a regression that drops the WHERE c.name=? filter would leak
|
||||||
|
rows from the seeded collections rather than passing.
|
||||||
|
"""
|
||||||
|
_seed_palace(tmp_path, "mempalace_drawers", [("d1", "doc", {"wing": "w"})])
|
||||||
|
_seed_palace(tmp_path, "mempalace_closets", [("c1", "abbrev", {"wing": "w"})])
|
||||||
|
assert list(repair.extract_via_sqlite(str(tmp_path), "not_a_real_collection")) == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_extract_via_sqlite_missing_palace_yields_nothing(tmp_path):
|
||||||
|
"""No chroma.sqlite3 → empty iterator, no exception. Callers depend
|
||||||
|
on this when probing speculatively."""
|
||||||
|
empty = tmp_path / "no_palace_here"
|
||||||
|
empty.mkdir()
|
||||||
|
assert list(repair.extract_via_sqlite(str(empty), "mempalace_drawers")) == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_roundtrips_via_real_chromadb(tmp_path):
|
||||||
|
"""End-to-end: seed source palace, rebuild into a fresh dest, then
|
||||||
|
open dest with a fresh ChromaBackend and verify ``count()`` and
|
||||||
|
metadata filters return the original rows. Also asserts a closet
|
||||||
|
document round-trips so a future regression that re-embeds with the
|
||||||
|
wrong EF or swaps drawer/closet content would fail here.
|
||||||
|
|
||||||
|
This is the single most important regression guard. If
|
||||||
|
``rebuild_from_sqlite`` silently drops rows or mangles metadata, no
|
||||||
|
other test in this file would catch it because they all stop at the
|
||||||
|
extraction layer.
|
||||||
|
"""
|
||||||
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
source = tmp_path / "source"
|
||||||
|
dest = tmp_path / "dest"
|
||||||
|
|
||||||
|
rows = [
|
||||||
|
(f"drawer_{i:03d}", f"body {i}", {"wing": "alpha" if i % 2 else "beta", "room": "r0"})
|
||||||
|
for i in range(40)
|
||||||
|
]
|
||||||
|
_seed_palace(source, "mempalace_drawers", rows)
|
||||||
|
_seed_palace(
|
||||||
|
source,
|
||||||
|
"mempalace_closets",
|
||||||
|
[("closet_x", "abbrev pointer →drawer_001", {"wing": "alpha"})],
|
||||||
|
)
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(source), str(dest))
|
||||||
|
assert counts == {"mempalace_drawers": 40, "mempalace_closets": 1}
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
drawers = backend.get_collection(str(dest), "mempalace_drawers")
|
||||||
|
assert drawers.count() == 40
|
||||||
|
alpha = drawers.get(where={"wing": "alpha"})
|
||||||
|
assert len(alpha["ids"]) == 20
|
||||||
|
|
||||||
|
# Spot-check that document text round-trips for one specific drawer
|
||||||
|
# — protects against a regression where extraction or upsert order
|
||||||
|
# silently swaps document bodies between IDs.
|
||||||
|
one = drawers.get(ids=["drawer_007"], include=["documents", "metadatas"])
|
||||||
|
assert one["documents"] == ["body 7"]
|
||||||
|
assert one["metadatas"][0]["wing"] == "alpha"
|
||||||
|
|
||||||
|
# Closets: the AAAK index layer. Re-embedded with the same EF so a
|
||||||
|
# known closet ID and its document body must come back intact.
|
||||||
|
closets = backend.get_collection(str(dest), "mempalace_closets")
|
||||||
|
assert closets.count() == 1
|
||||||
|
closet_row = closets.get(ids=["closet_x"], include=["documents", "metadatas"])
|
||||||
|
assert closet_row["documents"] == ["abbrev pointer →drawer_001"]
|
||||||
|
assert closet_row["metadatas"][0] == {"wing": "alpha"}
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_refuses_existing_dest(tmp_path):
|
||||||
|
"""Refuse to write into a directory that already exists when source
|
||||||
|
and dest differ. Without this, an unattended re-run would silently
|
||||||
|
interleave a partial rebuild with whatever's already at dest.
|
||||||
|
"""
|
||||||
|
source = tmp_path / "source"
|
||||||
|
dest = tmp_path / "dest"
|
||||||
|
_seed_palace(source, "mempalace_drawers", [("d1", "doc", {"wing": "w"})])
|
||||||
|
dest.mkdir()
|
||||||
|
# Drop a marker file so we can prove the dir wasn't touched.
|
||||||
|
(dest / "marker.txt").write_text("preexisting")
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(source), str(dest))
|
||||||
|
assert counts == {}
|
||||||
|
assert (dest / "marker.txt").read_text() == "preexisting"
|
||||||
|
assert not (dest / "chroma.sqlite3").exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_in_place_archives_when_opted_in(tmp_path):
|
||||||
|
"""In-place rebuild (source == dest) with ``archive_existing_dest=True``
|
||||||
|
must move the original aside to ``<dest>.pre-rebuild-<ts>`` and read
|
||||||
|
from the archive — the original drawer rows must survive in the new
|
||||||
|
palace, AND the archive itself must still contain the original rows.
|
||||||
|
|
||||||
|
Catches: a refactor that moves the original out but then reads from
|
||||||
|
the now-empty original location, producing an empty rebuild; also
|
||||||
|
catches a swap that empties the archive after reading.
|
||||||
|
"""
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
rows = [(f"d{i}", f"body {i}", {"wing": "w", "room": "r"}) for i in range(15)]
|
||||||
|
_seed_palace(palace, "mempalace_drawers", rows)
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(palace), str(palace), archive_existing_dest=True)
|
||||||
|
assert counts["mempalace_drawers"] == 15
|
||||||
|
|
||||||
|
archives = [p for p in tmp_path.iterdir() if p.name.startswith("palace.pre-rebuild-")]
|
||||||
|
assert len(archives) == 1
|
||||||
|
assert (archives[0] / "chroma.sqlite3").exists()
|
||||||
|
# Archive must still hold the same row count via the SQLite bypass —
|
||||||
|
# proves the archive wasn't silently truncated as a side effect.
|
||||||
|
archived_rows = list(repair.extract_via_sqlite(str(archives[0]), "mempalace_drawers"))
|
||||||
|
assert len(archived_rows) == 15
|
||||||
|
|
||||||
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
rebuilt = ChromaBackend().get_collection(str(palace), "mempalace_drawers")
|
||||||
|
assert rebuilt.count() == 15
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_in_place_refuses_without_archive_flag(tmp_path):
|
||||||
|
"""Source == dest without archive flag must abort untouched. The
|
||||||
|
most catastrophic possible regression of this code path is silently
|
||||||
|
deleting the only copy of the user's data."""
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
_seed_palace(palace, "mempalace_drawers", [("d1", "doc", {"wing": "w"})])
|
||||||
|
sqlite_before = (palace / "chroma.sqlite3").stat().st_size
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(palace), str(palace))
|
||||||
|
assert counts == {}
|
||||||
|
# Same file, untouched.
|
||||||
|
assert (palace / "chroma.sqlite3").stat().st_size == sqlite_before
|
||||||
|
archives = [p for p in tmp_path.iterdir() if "pre-rebuild" in p.name]
|
||||||
|
assert archives == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_source_missing_chroma_db(tmp_path):
|
||||||
|
"""Source dir exists but has no chroma.sqlite3 → returns empty,
|
||||||
|
leaves dest untouched."""
|
||||||
|
source = tmp_path / "source"
|
||||||
|
source.mkdir()
|
||||||
|
(source / "stray_file").write_text("not a palace")
|
||||||
|
dest = tmp_path / "dest"
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(source), str(dest))
|
||||||
|
assert counts == {}
|
||||||
|
assert not dest.exists()
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_in_place_validates_source_before_archiving(tmp_path):
|
||||||
|
"""In-place + archive_existing_dest=True with a dir that lacks
|
||||||
|
chroma.sqlite3 must NOT rename the dir before bailing. An earlier
|
||||||
|
revision archived first and validated second, leaving the user with
|
||||||
|
a renamed empty dir to manually undo. Catches that ordering bug.
|
||||||
|
"""
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
palace.mkdir()
|
||||||
|
(palace / "marker.txt").write_text("not a real palace")
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(palace), str(palace), archive_existing_dest=True)
|
||||||
|
assert counts == {}
|
||||||
|
# No archive created — original dir still in place with its marker.
|
||||||
|
assert palace.exists()
|
||||||
|
assert (palace / "marker.txt").read_text() == "not a real palace"
|
||||||
|
archives = [p for p in tmp_path.iterdir() if "pre-rebuild" in p.name]
|
||||||
|
assert archives == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_raises_on_upsert_failure(tmp_path, monkeypatch):
|
||||||
|
"""Mid-batch upsert failure must raise ``RebuildPartialError`` and
|
||||||
|
surface the failed collection + archive path so the user can recover.
|
||||||
|
Without this, an unattended script gets exit-code-zero on a partial
|
||||||
|
rebuild and the user discovers the data loss only when search starts
|
||||||
|
returning fewer hits.
|
||||||
|
"""
|
||||||
|
palace = tmp_path / "palace"
|
||||||
|
rows = [(f"d{i}", f"body {i}", {"wing": "w", "room": "r"}) for i in range(5)]
|
||||||
|
_seed_palace(palace, "mempalace_drawers", rows)
|
||||||
|
|
||||||
|
# Make the very first upsert raise so we don't depend on batch
|
||||||
|
# boundary behavior. Patching ChromaCollection.upsert (the wrapper
|
||||||
|
# mempalace's backend returns) keeps the failure path realistic.
|
||||||
|
# ``monkeypatch`` is pytest's built-in fixture that auto-restores
|
||||||
|
# the original attribute when the test exits, so we don't need to
|
||||||
|
# undo this manually.
|
||||||
|
from mempalace.backends.chroma import ChromaCollection
|
||||||
|
|
||||||
|
def boom(self, **kwargs):
|
||||||
|
raise RuntimeError("simulated chromadb upsert failure")
|
||||||
|
|
||||||
|
monkeypatch.setattr(ChromaCollection, "upsert", boom)
|
||||||
|
|
||||||
|
with pytest.raises(repair.RebuildPartialError) as excinfo:
|
||||||
|
repair.rebuild_from_sqlite(str(palace), str(palace), archive_existing_dest=True)
|
||||||
|
|
||||||
|
err = excinfo.value
|
||||||
|
assert err.failed_collection == "mempalace_drawers"
|
||||||
|
assert err.partial_counts.get("mempalace_drawers") == 0
|
||||||
|
assert err.archive_path is not None
|
||||||
|
assert os.path.isfile(os.path.join(err.archive_path, "chroma.sqlite3"))
|
||||||
|
assert err.dest_palace == os.path.abspath(str(palace))
|
||||||
|
|
||||||
|
|
||||||
|
def test_rebuild_from_sqlite_honors_configured_drawer_collection_name(tmp_path, monkeypatch):
|
||||||
|
"""A user with a non-default drawers collection name (set via
|
||||||
|
``MempalaceConfig().collection_name``) must have THAT collection
|
||||||
|
rebuilt — not the hardcoded ``mempalace_drawers``.
|
||||||
|
|
||||||
|
Catches: a regression where the recovery path silently rebuilds the
|
||||||
|
default-name collection on a custom-named palace, leaving the user's
|
||||||
|
actual data unrebuilt while reporting "rebuild complete." This is
|
||||||
|
the failure mode reviewer mjc flagged on PR #1310 as needing to line
|
||||||
|
up with the configured-collection-name work in #1312. Closets stay
|
||||||
|
fixed (``mempalace_closets``) by design — the AAAK index references
|
||||||
|
drawer IDs by string and is not per-deployment configurable.
|
||||||
|
|
||||||
|
Strategy: monkeypatch the lazy resolver so the test is hermetic and
|
||||||
|
does not depend on the global config file or env state.
|
||||||
|
"""
|
||||||
|
from mempalace.backends.chroma import ChromaBackend
|
||||||
|
|
||||||
|
custom_drawers = "custom_drawers_xyz"
|
||||||
|
monkeypatch.setattr(repair, "_drawers_collection_name", lambda: custom_drawers)
|
||||||
|
|
||||||
|
source = tmp_path / "source"
|
||||||
|
dest = tmp_path / "dest"
|
||||||
|
|
||||||
|
drawer_rows = [(f"d{i}", f"body {i}", {"wing": "alpha"}) for i in range(3)]
|
||||||
|
closet_rows = [("closet_a", "abbrev →d0", {"wing": "alpha"})]
|
||||||
|
_seed_palace(source, custom_drawers, drawer_rows)
|
||||||
|
_seed_palace(source, "mempalace_closets", closet_rows)
|
||||||
|
|
||||||
|
counts = repair.rebuild_from_sqlite(str(source), str(dest))
|
||||||
|
|
||||||
|
# Rebuilt under the custom name, not under the default "mempalace_drawers".
|
||||||
|
assert counts == {custom_drawers: 3, "mempalace_closets": 1}
|
||||||
|
|
||||||
|
backend = ChromaBackend()
|
||||||
|
rebuilt_drawers = backend.get_collection(str(dest), custom_drawers)
|
||||||
|
assert rebuilt_drawers.count() == 3
|
||||||
|
|
||||||
|
# Default-name collection must NOT exist in dest — proves we did not
|
||||||
|
# silently fall back to the hardcoded name during rebuild.
|
||||||
|
try:
|
||||||
|
rebuilt_default = backend.get_collection(str(dest), "mempalace_drawers")
|
||||||
|
# If get_collection returns without raising, count() should be 0
|
||||||
|
# (chromadb may auto-create on get with some EFs); a non-zero
|
||||||
|
# count would mean we wrote rows to the wrong collection.
|
||||||
|
assert rebuilt_default.count() == 0, (
|
||||||
|
"rebuild leaked rows into the default-name collection on a "
|
||||||
|
"custom-name palace — recovery wrote to the wrong collection."
|
||||||
|
)
|
||||||
|
except Exception:
|
||||||
|
pass # Expected: collection wasn't created.
|
||||||
|
|||||||
+78
-1
@@ -84,6 +84,24 @@ class TestSearchMemories:
|
|||||||
assert "error" in result
|
assert "error" in result
|
||||||
assert "query failed" in result["error"]
|
assert "query failed" in result["error"]
|
||||||
|
|
||||||
|
def test_search_memories_vector_path_uses_explicit_collection_name(self):
|
||||||
|
mock_col = MagicMock()
|
||||||
|
mock_col.query.return_value = {
|
||||||
|
"documents": [[]],
|
||||||
|
"metadatas": [[]],
|
||||||
|
"distances": [[]],
|
||||||
|
"ids": [[]],
|
||||||
|
}
|
||||||
|
|
||||||
|
with patch("mempalace.searcher.get_collection", return_value=mock_col) as get_collection:
|
||||||
|
search_memories("test", "/fake/path", collection_name="custom_drawers")
|
||||||
|
|
||||||
|
get_collection.assert_called_once_with(
|
||||||
|
"/fake/path",
|
||||||
|
collection_name="custom_drawers",
|
||||||
|
create=False,
|
||||||
|
)
|
||||||
|
|
||||||
def test_search_memories_filters_in_result(self, palace_path, seeded_collection):
|
def test_search_memories_filters_in_result(self, palace_path, seeded_collection):
|
||||||
result = search_memories("test", palace_path, wing="project", room="backend")
|
result = search_memories("test", palace_path, wing="project", room="backend")
|
||||||
assert result["filters"]["wing"] == "project"
|
assert result["filters"]["wing"] == "project"
|
||||||
@@ -102,7 +120,7 @@ class TestSearchMemories:
|
|||||||
"ids": [["d1", "d2"]],
|
"ids": [["d1", "d2"]],
|
||||||
}
|
}
|
||||||
|
|
||||||
def mock_get_collection(path, create=False):
|
def mock_get_collection(path, collection_name=None, create=False):
|
||||||
# First call: drawers. Second call: closets — raise so hybrid
|
# First call: drawers. Second call: closets — raise so hybrid
|
||||||
# degrades to pure drawer search (the catch block covers it).
|
# degrades to pure drawer search (the catch block covers it).
|
||||||
if not hasattr(mock_get_collection, "_called"):
|
if not hasattr(mock_get_collection, "_called"):
|
||||||
@@ -120,6 +138,65 @@ class TestSearchMemories:
|
|||||||
assert none_hit["wing"] == "unknown"
|
assert none_hit["wing"] == "unknown"
|
||||||
assert none_hit["room"] == "unknown"
|
assert none_hit["room"] == "unknown"
|
||||||
|
|
||||||
|
def test_effective_distance_clamped_to_valid_cosine_range(self):
|
||||||
|
"""A strong closet boost (up to 0.40) applied to a low-distance drawer
|
||||||
|
can drive ``dist - boost`` negative. That violates the cosine-distance
|
||||||
|
invariant ``[0, 2]``: the API returns ``similarity > 1.0`` and the
|
||||||
|
internal ``_sort_key`` sinks below ordinary positive distances,
|
||||||
|
inverting the ranking so the best hybrid matches sort last.
|
||||||
|
|
||||||
|
With the clamp, ``effective_distance`` stays in ``[0, 2]``,
|
||||||
|
``similarity`` stays in ``[0, 1]``, and the sort order is stable.
|
||||||
|
"""
|
||||||
|
# Drawer a.md gets a tiny base distance (0.08) — nearly exact match.
|
||||||
|
# Drawer b.md gets a larger base distance (0.35).
|
||||||
|
drawers_col = MagicMock()
|
||||||
|
drawers_col.query.return_value = {
|
||||||
|
"documents": [["doc-a", "doc-b"]],
|
||||||
|
"metadatas": [
|
||||||
|
[
|
||||||
|
{"source_file": "a.md", "wing": "w", "room": "r", "chunk_index": 0},
|
||||||
|
{"source_file": "b.md", "wing": "w", "room": "r", "chunk_index": 0},
|
||||||
|
]
|
||||||
|
],
|
||||||
|
"distances": [[0.08, 0.35]],
|
||||||
|
"ids": [["d-a", "d-b"]],
|
||||||
|
}
|
||||||
|
# A strong closet at rank 0 points at a.md → boost = 0.40,
|
||||||
|
# which exceeds a.md's base distance and would go negative without
|
||||||
|
# the clamp. No closet for b.md.
|
||||||
|
closets_col = MagicMock()
|
||||||
|
closets_col.query.return_value = {
|
||||||
|
"documents": [["closet-preview-a"]],
|
||||||
|
"metadatas": [[{"source_file": "a.md"}]],
|
||||||
|
"distances": [[0.2]], # within CLOSET_DISTANCE_CAP (1.5)
|
||||||
|
"ids": [["c-a"]],
|
||||||
|
}
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch("mempalace.searcher.get_collection", return_value=drawers_col),
|
||||||
|
patch("mempalace.searcher.get_closets_collection", return_value=closets_col),
|
||||||
|
):
|
||||||
|
result = search_memories("query", "/fake/path", n_results=5)
|
||||||
|
|
||||||
|
hits = result["results"]
|
||||||
|
assert hits, "should return results"
|
||||||
|
|
||||||
|
# Invariants on every hit.
|
||||||
|
for h in hits:
|
||||||
|
assert (
|
||||||
|
0.0 <= h["similarity"] <= 1.0
|
||||||
|
), f"similarity out of range: {h['similarity']} for {h['source_file']}"
|
||||||
|
assert 0.0 <= h["effective_distance"] <= 2.0, (
|
||||||
|
f"effective_distance out of range: {h['effective_distance']} "
|
||||||
|
f"for {h['source_file']}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# With the clamp, the closet-boosted a.md still ranks ahead of b.md —
|
||||||
|
# the boost still wins, but it no longer flips the ranking.
|
||||||
|
assert hits[0]["source_file"] == "a.md"
|
||||||
|
assert hits[0]["matched_via"] == "drawer+closet"
|
||||||
|
|
||||||
|
|
||||||
# ── BM25 internals: None / empty document safety ─────────────────────
|
# ── BM25 internals: None / empty document safety ─────────────────────
|
||||||
|
|
||||||
|
|||||||
Executable
+39
@@ -0,0 +1,39 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# backup_claude_jsonls.sh
|
||||||
|
#
|
||||||
|
# Claude Code stores every conversation as a JSONL transcript at
|
||||||
|
# ~/.claude/projects/<encoded-project>/<session-uuid>.jsonl
|
||||||
|
# Anthropic auto-deletes those files after 30 DAYS:
|
||||||
|
# https://docs.claude.com/en/docs/claude-code/data-usage
|
||||||
|
#
|
||||||
|
# This script copies them, read-only, into ~/Documents/Claude_JSONL_Backup/
|
||||||
|
# so the 30-day clock no longer applies. Re-run any time — rsync is incremental.
|
||||||
|
# It NEVER deletes, modifies, or touches files inside ~/.claude/.
|
||||||
|
|
||||||
|
set -eu
|
||||||
|
|
||||||
|
SRC="${HOME}/.claude/projects/"
|
||||||
|
DST="${HOME}/Documents/Claude_JSONL_Backup/"
|
||||||
|
|
||||||
|
[ -d "$SRC" ] || { echo "ERROR: $SRC does not exist."; exit 1; }
|
||||||
|
mkdir -p "$DST"
|
||||||
|
|
||||||
|
echo "Backing up $SRC -> $DST"
|
||||||
|
rsync -a --times "$SRC" "$DST"
|
||||||
|
|
||||||
|
src_count=$(find "$SRC" -type f -name '*.jsonl' | wc -l | tr -d ' ')
|
||||||
|
dst_count=$(find "$DST" -type f -name '*.jsonl' | wc -l | tr -d ' ')
|
||||||
|
oldest=$(find "$DST" -type f -name '*.jsonl' -exec stat -f '%Sm %N' -t '%Y-%m-%d' {} \; 2>/dev/null \
|
||||||
|
|| find "$DST" -type f -name '*.jsonl' -printf '%TY-%Tm-%Td %p\n' 2>/dev/null)
|
||||||
|
oldest_date=$(echo "$oldest" | sort | head -n 1 | awk '{print $1}')
|
||||||
|
newest_date=$(echo "$oldest" | sort | tail -n 1 | awk '{print $1}')
|
||||||
|
|
||||||
|
echo "Source JSONL count : $src_count"
|
||||||
|
echo "Backup JSONL count : $dst_count"
|
||||||
|
echo "Oldest backup file : ${oldest_date:-n/a}"
|
||||||
|
echo "Newest backup file : ${newest_date:-n/a}"
|
||||||
|
|
||||||
|
if [ "$src_count" -ne "$dst_count" ]; then
|
||||||
|
echo "FAIL: count mismatch ($src_count vs $dst_count)"; exit 2
|
||||||
|
fi
|
||||||
|
echo "OK: backup verified."
|
||||||
Executable
+115
@@ -0,0 +1,115 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# find_orphan_claude_jsonls.sh — v3 (multi-line shape + verb-aware preview)
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Finds Claude Code conversation transcripts (.jsonl) that may have survived in
|
||||||
|
# backup/sync locations. Claude Code stores transcripts at
|
||||||
|
# ~/.claude/projects/<encoded>/<session>.jsonl and auto-deletes them locally
|
||||||
|
# after 30 days. If your machine syncs to iCloud, Dropbox, Google Drive,
|
||||||
|
# OneDrive, Time Machine, or you copied transcripts elsewhere manually, those
|
||||||
|
# copies still exist. This script finds them and shows a topic preview from
|
||||||
|
# the first substantive user message — strips leading filler interjections
|
||||||
|
# ("ok so", "oh", "well", "hey") so previews surface the actual content.
|
||||||
|
#
|
||||||
|
# Read-only. Safe to re-run.
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
set -eu
|
||||||
|
|
||||||
|
LOCATIONS=(
|
||||||
|
"$HOME/Library/Mobile Documents" "$HOME/Dropbox" "$HOME/Google Drive"
|
||||||
|
"$HOME/OneDrive" "$HOME/Documents" "$HOME/Desktop" "/Volumes"
|
||||||
|
)
|
||||||
|
|
||||||
|
TMP="$(mktemp)"; trap 'rm -f "$TMP" "$TMP.s"' EXIT
|
||||||
|
|
||||||
|
printf "Scanning backup locations" >&2
|
||||||
|
for loc in "${LOCATIONS[@]}"; do
|
||||||
|
[ -d "$loc" ] || continue
|
||||||
|
printf "." >&2
|
||||||
|
while IFS= read -r -d '' f; do
|
||||||
|
# Combined: shape detection (multi-line) + verb-aware topic preview
|
||||||
|
if preview="$(python3 - "$f" 2>/dev/null <<'PYEOF'
|
||||||
|
import json, sys, re
|
||||||
|
|
||||||
|
# Single-word/short greetings — message gets skipped entirely if it is just one of these
|
||||||
|
GREETINGS = {'hi','hey','hello','thanks','thank you','ok','okay','yes','no',
|
||||||
|
'sure','cool','great','good','done','yep','nope','perfect','copy'}
|
||||||
|
|
||||||
|
# Leading filler — interjections that get STRIPPED from the start of a message
|
||||||
|
# before the preview is taken. Iterative — handles "ok so well, then..." → "then..."
|
||||||
|
LEADING_FILLER = re.compile(
|
||||||
|
r'^(?:ok(?:ay)?|so|oh|well|anyway|btw|hmm+|um+|uh+|hey|hi|hello|right|'
|
||||||
|
r'yes|no|sure|cool|great|good|listen|look|wait|actually|alright|gotcha|'
|
||||||
|
r'yeah|yep|nope|nah)\b[\s,!.?:;-]*',
|
||||||
|
re.IGNORECASE
|
||||||
|
)
|
||||||
|
|
||||||
|
path = sys.argv[1]
|
||||||
|
shape_ok = False
|
||||||
|
preview = ""
|
||||||
|
try:
|
||||||
|
with open(path, 'r', errors='replace') as fh:
|
||||||
|
for i, line in enumerate(fh):
|
||||||
|
if i >= 30: break
|
||||||
|
try:
|
||||||
|
d = json.loads(line)
|
||||||
|
except Exception:
|
||||||
|
continue
|
||||||
|
if not isinstance(d, dict): continue
|
||||||
|
# Shape check — accept if any line in first 30 has session fields
|
||||||
|
if not shape_ok and 'sessionId' in d and 'timestamp' in d and 'message' in d:
|
||||||
|
shape_ok = True
|
||||||
|
# Preview — first user message after stripping leading filler
|
||||||
|
if not preview:
|
||||||
|
role = d.get('type', '') or d.get('message', {}).get('role', '')
|
||||||
|
if role == 'user':
|
||||||
|
content = d.get('message', {}).get('content', '')
|
||||||
|
if isinstance(content, list):
|
||||||
|
text = ' '.join(
|
||||||
|
c.get('text', '') for c in content
|
||||||
|
if isinstance(c, dict) and c.get('type') == 'text'
|
||||||
|
)
|
||||||
|
elif isinstance(content, str):
|
||||||
|
text = content
|
||||||
|
else:
|
||||||
|
text = ''
|
||||||
|
text = re.sub(r'\s+', ' ', text).strip()
|
||||||
|
# Skip messages that are pure greetings
|
||||||
|
if text.lower() in GREETINGS:
|
||||||
|
continue
|
||||||
|
# Iteratively strip leading filler tokens until stable
|
||||||
|
prev_text = None
|
||||||
|
while prev_text != text:
|
||||||
|
prev_text = text
|
||||||
|
text = LEADING_FILLER.sub('', text).strip()
|
||||||
|
# Skip if what remains is too short
|
||||||
|
if len(text) < 20:
|
||||||
|
continue
|
||||||
|
preview = text[:80] + ('...' if len(text) > 80 else '')
|
||||||
|
if shape_ok and preview: break
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
if shape_ok:
|
||||||
|
print(preview if preview else "(no preview — first 30 lines were greetings or short)")
|
||||||
|
sys.exit(0)
|
||||||
|
sys.exit(1)
|
||||||
|
PYEOF
|
||||||
|
)"; then
|
||||||
|
mtime="$(stat -f '%Sm' -t '%Y-%m-%d' "$f" 2>/dev/null || stat -c '%y' "$f" 2>/dev/null | cut -d' ' -f1)"
|
||||||
|
size="$(stat -f '%z' "$f" 2>/dev/null || stat -c '%s' "$f" 2>/dev/null)"
|
||||||
|
printf '%s\t%s\t%s\t%s\n' "$mtime" "$size" "$f" "$preview" >>"$TMP"
|
||||||
|
fi
|
||||||
|
done < <(find "$loc" -type f -name '*.jsonl' -print0 2>/dev/null)
|
||||||
|
done
|
||||||
|
printf "\n" >&2
|
||||||
|
|
||||||
|
count=$(wc -l <"$TMP" | tr -d ' ')
|
||||||
|
if [ "$count" -eq 0 ]; then
|
||||||
|
echo "No orphan Claude Code transcripts found in scanned backup locations."
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
sort -k1,1 "$TMP" >"$TMP.s"
|
||||||
|
oldest="$(head -n 1 "$TMP.s" | cut -f1)"
|
||||||
|
newest="$(tail -n 1 "$TMP.s" | cut -f1)"
|
||||||
|
echo "Found $count orphan Claude Code transcript(s). Oldest: $oldest Newest: $newest"
|
||||||
|
echo "----------------------------------------------------------------------"
|
||||||
|
awk -F'\t' '{ printf "%s %10s %s\n \"%s\"\n\n", $1, $2, $3, $4 }' "$TMP.s"
|
||||||
Executable
+80
@@ -0,0 +1,80 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""render_jsonl.py — turn one Claude Code JSONL transcript into readable text.
|
||||||
|
|
||||||
|
Claude Code stores conversations at ~/.claude/projects/<proj>/<uuid>.jsonl and
|
||||||
|
Anthropic auto-deletes them after 30 days
|
||||||
|
(https://docs.claude.com/en/docs/claude-code/data-usage). This script renders a
|
||||||
|
JSONL into a clean .txt so you can keep / read / share it without the tooling.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python3 render_jsonl.py <input.jsonl> [output.txt]
|
||||||
|
|
||||||
|
Stdlib only. Python 3.9+. Read-only on the input.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
def extract_text(content):
|
||||||
|
if isinstance(content, str):
|
||||||
|
return content.strip()
|
||||||
|
if isinstance(content, list):
|
||||||
|
parts = []
|
||||||
|
for blk in content:
|
||||||
|
if isinstance(blk, dict) and blk.get("type") == "text":
|
||||||
|
t = (blk.get("text") or "").strip()
|
||||||
|
if t:
|
||||||
|
parts.append(t)
|
||||||
|
return "\n".join(parts)
|
||||||
|
return ""
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
if len(sys.argv) < 2:
|
||||||
|
print(__doc__)
|
||||||
|
sys.exit(1)
|
||||||
|
src = Path(sys.argv[1])
|
||||||
|
if not src.is_file():
|
||||||
|
print(f"ERROR: not a file: {src}")
|
||||||
|
sys.exit(1)
|
||||||
|
out = open(sys.argv[2], "w", encoding="utf-8") if len(sys.argv) > 2 else sys.stdout
|
||||||
|
|
||||||
|
turns, stamps = [], []
|
||||||
|
for raw in src.read_text(encoding="utf-8", errors="replace").splitlines():
|
||||||
|
if not raw.strip():
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
obj = json.loads(raw)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
continue
|
||||||
|
role = obj.get("type") or (obj.get("message") or {}).get("role")
|
||||||
|
if role not in ("user", "assistant"):
|
||||||
|
continue
|
||||||
|
msg = obj.get("message") or obj
|
||||||
|
text = extract_text(msg.get("content"))
|
||||||
|
if not text:
|
||||||
|
continue
|
||||||
|
ts = obj.get("timestamp") or ""
|
||||||
|
if ts:
|
||||||
|
stamps.append(ts)
|
||||||
|
turns.append((ts, role, text))
|
||||||
|
|
||||||
|
header = [
|
||||||
|
f"# Claude Code transcript: {src}",
|
||||||
|
f"# Total turns: {len(turns)}",
|
||||||
|
f"# Date range : {min(stamps) if stamps else 'n/a'} -> {max(stamps) if stamps else 'n/a'}",
|
||||||
|
"#" + "-" * 70,
|
||||||
|
"",
|
||||||
|
]
|
||||||
|
out.write("\n".join(header))
|
||||||
|
for ts, role, text in turns:
|
||||||
|
out.write(f"\n[{ts}] {role.upper()}\n{text}\n\n{'-'*72}\n")
|
||||||
|
if out is not sys.stdout:
|
||||||
|
out.close()
|
||||||
|
print(f"Wrote {len(turns)} turns to {sys.argv[2]}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,26 @@
|
|||||||
|
---
|
||||||
|
description: Save the current Claude Code session into MemPalace. Idempotent — won't dupe.
|
||||||
|
---
|
||||||
|
|
||||||
|
# /save
|
||||||
|
|
||||||
|
Save the current Claude Code session into MemPalace. Run this when you
|
||||||
|
want a checkpoint. Safe to run repeatedly — drawer IDs are content-hashed
|
||||||
|
so re-running on the same session overwrites in place, no duplicates.
|
||||||
|
|
||||||
|
Behavior:
|
||||||
|
|
||||||
|
1. Find the current session's JSONL transcript path (Claude Code passes
|
||||||
|
it via the conversation context — look for `~/.claude/projects/` paths).
|
||||||
|
2. Run via bash:
|
||||||
|
|
||||||
|
```
|
||||||
|
mempalace mine "<TRANSCRIPT_PATH>" --mode convos --wing claude_imports
|
||||||
|
```
|
||||||
|
|
||||||
|
3. If the user supplied an argument after `/save`, use it as the wing name
|
||||||
|
instead of `claude_imports` (e.g. `/save my_research` →
|
||||||
|
`--wing my_research`).
|
||||||
|
4. Report back: how many drawers were filed, into which wing/room.
|
||||||
|
|
||||||
|
Requires `mempalace` to be installed (`pip install mempalace`).
|
||||||
Reference in New Issue
Block a user