* fix: register 0-chunk files to prevent re-processing on every mine (#654)
mine_convos() has three early-exit paths (OSError, content too short,
zero chunks) that skip writing anything to ChromaDB. Since
file_already_mined() checks for the presence of a document with a
matching source_file, these files are re-read and re-processed on
every subsequent run.
Add _register_file() that upserts a lightweight sentinel document
(room="_registry", ingest_mode="registry") so file_already_mined()
returns True on future runs.
Note: Bug 2 from the issue (drawers_added counter always 0) was
already resolved upstream via the switch from collection.add() to
collection.upsert().
* fix: resolve macOS path symlink in test + remove unused variable
* fix: return "general" room from process_file error paths (#586)
process_file() returned (0, None) for already-mined, unreadable, and
too-short files. In --dry-run mode the caller always enters the
room_counts branch, so None ended up as a dict key and crashed the
summary printer with "unsupported format string passed to
NoneType.__format__".
Returning "general" instead of None makes the function contract
explicit: it always yields (int, str). This matches the consensus
fix discussed in the issue thread.
* style: apply ruff format to test_miner.py
* fix: allow Unicode in sanitize_name() — Latvian, CJK, Cyrillic names (#637)
_SAFE_NAME_RE was ASCII-only ([a-zA-Z0-9]), rejecting valid Unicode
names like "Jānis" or "太郎". Changed to \w which matches Unicode
word characters (letters, digits, underscore) in Python 3.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: tighten Unicode regex, add sanitize_name tests
Use [^\W_] for first/last char to allow Unicode letters/digits but
reject leading/trailing underscores (Copilot feedback). Add 7 tests
covering Latvian, CJK, Cyrillic, path traversal, and edge cases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Note from code review: (1) silent exception swallow on migration failure means caller proceeds with potentially corrupt DB — consider returning a boolean or re-raising in a follow-up. (2) No blob length validation before int.from_bytes — malformed rows could produce wrong seq_id values. Both are edge cases; the fix is still valuable for the common chromadb 0.6→1.5 migration path.
* refactor: add stage-1 backend abstraction seam
Introduce the first upstreamable storage seam for MemPalace without
bringing in the PostgreSQL spike or any benchmark artifacts.
This change adds a small backend package with:
- BaseCollection as the minimal collection contract
- ChromaBackend/ChromaCollection as the default implementation
It then routes the main runtime collection consumers through that seam:
- palace.py
- searcher.py
- layers.py
- palace_graph.py
- mcp_server.py
- miner.status()
Behavioral constraints kept for stage 1:
- ChromaDB remains the only backend and the default path
- no config/env backend selection yet
- no PostgreSQL code
- no benchmark or research files
- existing tests stay unchanged
Important compatibility details:
- read paths now call the seam with create=False so they still surface
the existing 'no palace found' behavior instead of silently creating
empty collections
- write paths keep create=True semantics through palace.get_collection()
- layers/searcher retain a chromadb module attribute so the existing
mock-based tests can keep patching PersistentClient unchanged
- ChromaBackend only creates palace directories on create=True, which
preserves mocked read-path tests that use fake read-only paths
Verification:
- python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q # 529 passed, 106 deselected
* refactor: clean up stage-1 seam compatibility shims
Tighten the stage-1 backend abstraction branch after review.
This follow-up does three small things:
- keep the chromadb compatibility hook in searcher.py and layers.py,
but express it through the backends.chroma module so it no longer
reads like an accidental unused import
- fix the palace_graph.py helper alias to avoid the local name collision
flagged by ruff (imported helper vs local _get_collection wrapper)
- preserve the existing mock-based test patch points unchanged while
keeping the new backend seam intact
Why this matters:
- the direct form looked like a
dead import in review, even though it was intentionally preserving the
existing test seam ( and
)
- palace_graph.py had a real lint issue ( redefinition) that was
small but worth fixing before a public PR
Verification:
- /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q tests/test_layers.py tests/test_searcher.py
- pytest -q # 529 passed, 106 deselected
* docs: explain backend shim imports in search paths
Add short code comments in searcher.py and layers.py explaining why the
module-level `chromadb` alias remains after the stage-1 backend seam
refactor.
The alias is intentional: it preserves the existing mock patch points used
by the current test suite (`mempalace.searcher.chromadb.PersistentClient`
and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic
now flows through the backend abstraction.
This keeps the public PR easier to review because the apparent "unused
import" now has an explicit reason next to it.
Verification:
- /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py
- pytest -q tests/test_layers.py tests/test_searcher.py
* refactor: reuse a default backend instance in palace helper
Tighten the stage-1 backend seam by promoting the default Chroma backend
adapter to a module-level singleton in `mempalace/palace.py`.
This keeps the stage-1 scope unchanged — Chroma is still the only backend
wired in this branch — but avoids constructing a fresh `ChromaBackend()`
object on every `get_collection()` call. The backend is stateless today,
so this is a readability/cleanup change rather than a behavioral one.
Why this helps:
- makes `palace.get_collection()` read like a real default factory instead
of an inline constructor call
- keeps the stage-1 branch a little cleaner before opening the public PR
- does not widen the backend surface or change any config/runtime behavior
Verification:
- python3 -m py_compile mempalace/palace.py
- pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py
- pytest -q # 529 passed, 106 deselected
* fix: harden read-only seam behavior and update seam tests
Preserve the stage-1 backend abstraction while closing the real read-path
regression surfaced in PR review.
What changed:
- make ChromaBackend.get_collection(create=False) fail fast when the palace
directory does not exist instead of letting PersistentClient create it as a
side effect
- update miner.status() to call get_collection(..., create=False) so status
keeps the historical 'No palace found' behavior
- remove the temporary chromadb shim aliases from layers.py and searcher.py
now that the tests patch the seam directly
- add focused tests for the new backends package, including ChromaCollection
delegation and ChromaBackend create=True/create=False behavior
- retarget layer/searcher tests to patch the backend seam instead of patching
chromadb.PersistentClient inside production modules
- add a regression test that status() does not create an empty palace when the
target path is missing
Verification:
- ruff check .
- uv run pytest -q
- uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py
Notes:
- the separate benchmark/slow/stress layer was started as a soak but not used
as the merge gate for this PR branch
* refactor: drop duplicate mcp collection cache declaration
Remove a redundant `_collection_cache = None` assignment in
`mempalace/mcp_server.py` left over after the stage-1 backend seam refactor.
This does not change behavior; it only trims review noise in the MCP server
module after the read-path hardening pass.
Verification:
- ruff check mempalace/mcp_server.py
- uv run pytest -q tests/test_mcp_server.py
---------
Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>
On Windows, projects containing git-submodule junctions or dev-drive
reparse points cause iterdir() to list the entry successfully but
Path.is_dir() to raise OSError when it calls stat() internally.
Reproducer: any Windows project with a submodule checked out as a
junction (e.g. skills/pr-perfect) crashes mempalace init with:
OSError: [WinError 448] The path cannot be traversed because it
contains an untrusted mount point
Fix: wrap every is_dir() call in detect_rooms_from_folders with
try/except OSError so the scanner skips inaccessible entries and
continues rather than aborting.
Covers both the top-level pass and the one-level-deep nested pass.
Two new tests mock the OSError on specific paths and verify the
function returns correct rooms from the remaining accessible entries.
Addresses Issue #333: AI agents prepending system prompts to search queries
causes embedding retrieval to collapse (89.8% → 1.0% R@10).
Mitigation approach (減災):
- New query_sanitizer.py with 4-stage pipeline:
Step 1: passthrough for short queries (≤200 chars)
Step 2: question extraction (finds ? sentences) → ~85-89% recovery
Step 3: tail sentence extraction → ~80-89% recovery
Step 4: tail truncation fallback → ~70-80% recovery
Worst case without sanitizer: 1.0% (catastrophic)
Worst case with sanitizer: ~70-80% (survivable)
- mcp_server.py: tool_search applies sanitizer before ChromaDB query
- MCP schema: query description warns agents not to include prompts
- New 'context' parameter separates background info from search intent
- Sanitizer metadata included in response when triggered
22 new tests covering all pipeline stages and real-world scenarios.
Made-with: Cursor
The initialize handler hardcoded protocolVersion "2024-11-05", which
causes newer MCP clients (e.g. Claude Code) to reject the connection
when they negotiate "2025-11-25" or later.
Echo the client's requested version if it is in the supported set,
otherwise fall back to the latest supported version. This keeps
backwards compatibility with older clients while allowing newer ones
to connect.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merged both the PR's benchmark suite additions (psutil dep, pytest
markers, --ignore=tests/benchmarks) and upstream's coverage changes
(pytest-cov, --cov-fail-under=30, coverage config) so both coexist.
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
_patch_mcp_server had palace_path removed from its signature but the
assertion body still referenced it, causing NameError at runtime and
F821 from ruff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add tests for config, convo_miner, spellcheck, knowledge_graph
- Fix Windows PermissionError in test cleanup (chromadb file locks)
- Add UTF-8 encoding to split_mega_files, entity_registry, hooks_cli
- Fix mcp_server parse_known_args logging for unknown args
- Set coverage threshold to 85 in pyproject.toml and CI
- Reset all version files to 3.0.11
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MCP tool_add_drawer:
- Make drawer_id content-based: hash full content instead of
content[:100] + timestamp. Same content → same ID, eliminating
TOCTOU race conditions
- Switch from col.add() to col.upsert() so re-filing with updated
content updates the existing drawer
miner.add_drawer:
- Switch from collection.add() to collection.upsert() so re-mining
a modified file updates instead of silently failing
- Remove the try/except catching 'already exists' — upsert handles
this naturally
Findings: #11 (HIGH — add ignores updates), #6 (MEDIUM — TOCTOU),
#13 (MEDIUM — non-deterministic IDs)
Includes test infrastructure from PR #131.
92 tests pass.
Add/expand tests for normalize (39%→97%), searcher (39%→100%),
layers (28%→97%), split_mega_files (34%→72%).
Fix mcp_server.py parse_args→parse_known_args to prevent SystemExit
when imported during pytest (CI was crashing on all test jobs).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Run ruff format on all benchmark files (fixes CI lint job)
- Fix check_regression() substring ambiguity: ordered keyword matching
so "latency_improvement_pct" is correctly classified as higher-is-better
- Update stale comments in conftest.py referencing wrong fixture
- Add pytest addopts to skip benchmark/slow/stress markers by default
Concentrates all drawers into a single wing+room to isolate the
embedding model's retrieval limit independent of palace filtering.
Confirms recall degrades to ~0.4-0.5 at 5K drawers per room even
with wing+room filters applied — the spatial structure helps by
keeping buckets small, but can't fix the underlying embedding ceiling.
Benchmark mempalace at configurable scale (1K–100K drawers) to find
real-world performance limits. Tests cover MCP tool OOM thresholds,
ChromaDB query degradation, search recall@k, mining throughput,
knowledge graph concurrency, memory leak detection, palace boost
quantification, and Layer1 unbounded fetch behavior.
- tests/benchmarks/ with 8 test modules + data generator + report system
- Deterministic data factory with planted needles for recall measurement
- JSON report output with regression detection (--bench-report flag)
- CI benchmark job on PRs at small scale
- psutil added as dev dependency for RSS tracking
The MCP server previously created a new PersistentClient on every tool
call via _get_collection(). This incurs HNSW index loading overhead
on each request.
Cache the client and collection at module level. The cache resets
naturally on process restart (MCP runs as a subprocess).
Also adds a _reset_mcp_cache fixture to conftest.py for test isolation.
Includes test infrastructure from PR #131.
92 tests pass.