Commit Graph

133 Commits

Author SHA1 Message Date
mvalentsev 5189e0d652 test(i18n): add entity section smoke tests and schema invariants 2026-04-18 21:58:11 +05:00
Igor Lins e Silva 4a088ea8e1 Address Copilot review: cursor tie-break, honest metrics, accurate comments
Six items from the automated review on PR #998:

1. **Cursor tie-break bug (correctness).** The skip condition was
   `rec.timestamp <= cursor`; if multiple messages share the max
   timestamp and only some were ingested before a crash, the rest
   would be lost forever. Changed to `< cursor`, relying on
   deterministic drawer IDs for safe re-attempt at the boundary.
   Regression test
   `test_sweep_recovers_untaken_message_at_cursor_timestamp`.

2. **`drawers_added` counted upserts, not adds.** Added a pre-flight
   `collection.get(ids=batch)` to distinguish new rows from already-
   present ones. Return value now carries `drawers_added`,
   `drawers_already_present`, `drawers_upserted`, and `drawers_skipped`
   separately. Dict-compatible access (`existing.get("ids")`) keeps it
   working on both the raw Chroma return and the typed `GetResult`.

3. **`sweep_directory` hid failures in the summary.** `files_processed`
   used to exclude failed files. Replaced with `files_attempted` (all
   discovered) + `files_succeeded` (subset that completed); CLI output
   shows `succeeded/attempted`.

4. **Coordination claim was overstated.** The primary miners don't
   stamp `session_id`/`timestamp` metadata, so the sweeper coordinates
   only with its own prior runs. Softened docstrings on module and CLI
   command. Uniform cross-miner metadata is flagged as a follow-up.

5. **MAX_FILE_SIZE comments were misleading.** Said source size "does
   not affect storage or embedding cost" — true per-drawer, but source
   size still scales drawer count, embedding work, and memory usage
   (files are read in full, not streamed). Corrected in both
   `miner.py` and `convo_miner.py`.

6. Added the tie-break regression test that reproduces the correctness
   bug from (1).

Tests: 970 passed (was 969), ruff + pre-commit clean.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
2026-04-18 13:22:18 -03:00
Igor Lins e Silva 29ce7c7135 Harden sweeper for production: verbatim tool blocks, full session_id, logged failures
Four changes on top of the proposal's initial sweeper draft, driven by
the CLAUDE.md design principles:

1. Drop the 500-char truncation on tool_use / tool_result content in
   _flatten_content. The "verbatim always" principle forbids lossy
   compression of user-adjacent data; a long code-edit diff handed to
   the assistant must round-trip intact. Unknown block types now also
   serialize their full payload instead of just a type marker. New test
   test_parse_preserves_tool_blocks_verbatim covers a 5000-char input.

2. Use the full session_id in drawer IDs (not session_id[:12]). Rules
   out cross-session collisions if a transcript source ever uses
   non-UUID session identifiers or shared prefixes.

3. Replace silent `except Exception: return None` in get_palace_cursor
   with a logger.warning — the exact anti-pattern this PR otherwise
   criticizes in miner.py. The fallback behavior is still safe
   (deterministic IDs make a missed cursor recover on the next run),
   but the failure is now discoverable.

4. sweep_directory now collects per-file failures into the result dict
   and the CLI exits non-zero when any file failed, so a partial-sweep
   outcome is visible rather than swallowed.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
2026-04-18 13:14:32 -03:00
MSL fed69935d3 Add tandem sweeper: message-level safety net for dropped transcripts
The primary miners (miner.py, convo_miner.py) operate at file
granularity and can drop data for several reasons: size caps, silent
OSError on read, dedup false positives, extensions the project miner
does not recognize. Even with tonight's hotfixes, any future bug in
the file-level path risks silent data loss.

The sweeper is a second, cooperating miner that works at MESSAGE
granularity:

  - Parses Claude Code .jsonl line by line, yielding only
    user/assistant records (filters progress, file-history-snapshot,
    etc. noise).
  - For each session_id, queries the palace for max(timestamp) and
    treats that as the cursor.
  - Ingests only messages newer than the cursor, as one small drawer
    per exchange (never hits a size cap — each drawer is 1-5 KB).
  - Deterministic drawer IDs from session_id + message UUID make
    reruns idempotent; crash mid-sweep is safe.

Tandem coordination is free: if the primary miner committed up to
timestamp T, the sweeper resumes from T. If the primary miner missed
everything, the sweeper catches it all. Neither duplicates the other.

Smoke test on a real Claude Code transcript:
  1st run: +39 drawers, 0 already present
  2nd run: +0 drawers, 39 already present  (perfect idempotence)

Opt-in via:
  mempalace sweep <file.jsonl>
  mempalace sweep <transcript-dir>

No changes to existing miners. No schema migration. Purely additive.

Tests: tests/test_sweeper.py (7 tests covering parsing, tandem
coordination, idempotency, resume-from-cursor, metadata correctness).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:52:06 -03:00
MSL 6f33d52681 Raise convo_miner MAX_FILE_SIZE cap 10 MB → 500 MB
Mirrors the miner.py fix in this same branch. convo_miner.py had the
exact same 10 MB cap at line 58 that silently dropped long transcripts
via continue. Long Claude Code sessions, multi-year ChatGPT exports,
and lifetime Slack dumps all exceed 10 MB. Same silent-drop pattern,
different file.

Raised to 500 MB to match miner.py for consistency; downstream chunking
means source file size does not affect storage or embedding cost.

Tests: tests/test_convo_miner_size_cap.py (1 test)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:52:01 -03:00
MSL d137d12313 Raise MAX_FILE_SIZE cap from 10 MB to 500 MB
Long Claude Code sessions routinely produce transcripts larger than 10
MB. The previous cap at miner.py:65 silently dropped them at line 732
with `if filepath.stat().st_size > MAX_FILE_SIZE: continue` — same
silent-failure pattern as the .jsonl extension bug.

The cap exists as a safety rail against pathological binaries, not as
a limit on legitimate text. Downstream chunking at 800 chars per drawer
means source file size does not affect storage or embedding cost.

500 MB leaves headroom for year-long continuous transcripts while still
catching accidental multi-GB binary mines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:52:01 -03:00
MSL 560fdbdc9f Fix silent drop of .jsonl files in project miner
mempalace/miner.py:READABLE_EXTENSIONS contained `.json` but not
`.jsonl`. Every jsonl file encountered in a mined directory was
silently skipped at miner.py:722:

    if filepath.suffix.lower() not in READABLE_EXTENSIONS:
        continue

Claude Code transcripts, ChatGPT exports, and every other tool writing
line-delimited JSON ship as `.jsonl`. Users running `mempalace mine`
against a directory of transcripts saw the command complete with no
error and no log line — and their conversations never reached the
palace. Silent data loss.

Adding `.jsonl` to the whitelist alongside `.json`. jsonl is text
line-by-line; the existing chunking pipeline handles it the same way
it handles any other text file.

Tests: tests/test_miner_jsonl_visibility.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:52:01 -03:00
Igor Lins e Silva 55a004fe1e Merge pull request #931 from mvalentsev/fix/i18n-entity-metadata
fix: use i18n candidate patterns for entity extraction in miner and palace
2026-04-16 15:54:01 -03:00
Igor Lins e Silva c5e249bba8 Merge pull request #946 from mvalentsev/fix/utf8-read-text
fix: add explicit UTF-8 encoding to read_text() calls (#776)
2026-04-16 15:52:42 -03:00
Igor Lins e Silva 65f99ad7e6 Merge pull request #928 from arnoldwender/fix/i18n-lang-case-insensitive
fix(i18n): resolve language codes case-insensitively (#927)
2026-04-16 15:44:36 -03:00
mvalentsev 09fe2dda3c fix: add explicit UTF-8 encoding to read_text() calls (#776)
On Windows with non-UTF-8 locale (e.g. GBK), Path.read_text() defaults
to platform encoding, breaking onboarding tests and any source code that
reads JSON/markdown with non-ASCII content.

5 files, 8 call sites fixed.
2026-04-16 16:00:29 +05:00
🍕 88f5b5fa0e Add Indonesian language support
Introduces the Indonesian (id) locale, providing translations for CLI commands, status messages, and core terminology.

Includes language-specific regex patterns for stop words and action detection to support text processing and indexing in Indonesian. The test suite is updated with a sample case to verify correct dialect handling and compression.
2026-04-16 16:15:47 +08:00
mvalentsev 8bf940f861 fix: use i18n candidate patterns for entity extraction in miner and palace
entity_detector.py was refactored in #911 to load candidate patterns
from i18n locale JSON files, supporting non-Latin scripts (Cyrillic,
accented Latin, etc.). But three other code paths still hardcoded the
ASCII-only regex [A-Z][a-z]{2,}, silently missing non-Latin entity
names in metadata tagging, closet indexing, and registry lookups.

Replace the hardcoded regex with a shared _candidate_entity_words()
helper that reuses the same i18n candidate_patterns as entity_detector.
2026-04-16 10:35:40 +05:00
Igor Lins e Silva f895bc58e6 fix(entity_detector): script-aware word boundaries for combining-mark scripts
Python's \b is a \w/non-\w transition. Devanagari vowel signs (matras)
like ा ी ु are Unicode category Mc (Mark, Spacing Combining) — not \w.
This means \b splits mid-word on every matra: names like अनीता (Anita)
truncate to अनीत, and person-verb patterns like \bराज\s+ने\s+कहा\b
never match because \b fails after the final matra of कहा.

Same issue affects Arabic, Hebrew, Thai, Tamil, and every other script
whose words contain combining marks.

Fix: locales with combining-mark scripts declare a boundary_chars field
in their entity section (e.g. "\\w\\u0900-\\u097F" for Hindi). The i18n
loader replaces every \b in that locale's patterns with a script-aware
lookaround that treats the declared characters as "inside-word", and
pre-wraps candidate/multi_word patterns with the same boundary.

Default behavior (no boundary_chars) keeps standard \b — en, pt-br, ru,
it are unchanged.

Changes:
- mempalace/i18n/__init__.py: add _script_boundary, _expand_b,
  _wrap_candidate, _collect_entity_section; candidate_patterns are now
  returned fully-wrapped (boundary + capture group applied)
- mempalace/entity_detector.py: extract_candidates compiles pre-wrapped
  candidate patterns directly instead of re-wrapping with \b
- tests/test_entity_detector.py: 5 new tests for Devanagari boundaries
  (name extraction with/without boundary_chars, person-verb firing,
  English regression)
2026-04-15 22:18:52 -03:00
Arnold Wender 0174b93d0f fix(i18n): resolve language codes case-insensitively (#927)
BCP 47 language tags are case-insensitive (RFC 5646 §2.1.1) but the
locale files mix conventions (pt-br.json vs zh-CN.json). On
case-sensitive filesystems, '--lang PT-BR' or '--lang zh-cn' silently
missed the file, _load_entity_section returned {}, and entity
detection ran in English with no warning.

The cache key in get_entity_patterns was built from raw input, so
('PT-BR',) and ('pt-br',) produced two distinct entries, both wrong.

Add _canonical_lang(lang) that resolves any casing to the on-disk
filename stem via lowercase comparison, and route load_lang,
_load_entity_section, and the cache key through it.

Closes #927
2026-04-15 23:33:42 +02:00
Igor Lins e Silva 312b3b5f0e Merge pull request #758 from mvalentsev/fix/i18n-review-issues
fix: address i18n review issues from PR #718
2026-04-15 13:45:49 -03:00
Igor Lins e Silva c722c91e2a test: document orphan-locale recovery for _temp_locale helper 2026-04-15 08:54:23 -03:00
Igor Lins e Silva b214aced90 refactor(entity_detector): make multi-language extensible via i18n JSON
Move all entity-detection lexical patterns (person verbs, pronouns,
dialogue markers, project verbs, stopwords, candidate character class)
out of hardcoded module-level constants and into the entity section of
each locale's JSON in mempalace/i18n/. Adds a languages parameter to
every public function so callers union patterns across the desired
locales. The default stays ("en",), so all existing callers and tests
behave unchanged.

Also adds:
- get_entity_patterns(langs) helper in mempalace/i18n/ that merges
  patterns across requested languages, dedupes lists, unions stopwords,
  and falls back to English for unknown locales
- MempalaceConfig.entity_languages property + setter, with env var
  override (MEMPALACE_ENTITY_LANGUAGES, comma-separated)
- mempalace init --lang en,pt-br flag (persists to config.json)
- Per-language candidate_pattern so non-Latin scripts (Cyrillic,
  Devanagari, CJK) can register their own character classes instead of
  being silently dropped by the ASCII-only [A-Z][a-z]+ default
- _build_patterns LRU cache keyed by (name, languages) so multi-language
  callers don't poison each other's cache slots

Why now: the open language PRs (#760 ru, #773 hi, #778 id, #907 it) only
add CLI strings via mempalace/i18n/. PR #156 (pt-br) is the first that
needed entity_detector changes and inlined a _PTBR variant of every
constant. That doesn't scale past 2-3 languages — every text gets
checked against every language's patterns regardless of relevance, and
candidate extraction still drops accented and non-Latin names.

This PR sets the standard so future locale contributors only edit one
JSON file (no Python changes), and entity detection scales linearly
with how many languages a user actually enabled, not how many ship.
2026-04-15 08:52:42 -03:00
Igor Lins e Silva 56b6a6360f Merge pull request #908 from fatkobra/test/palace-graph-tunnels
test: add palace_graph tunnel helper coverage
2026-04-15 08:23:18 -03:00
fatkobra 966937d620 test: add palace_graph tunnel helper coverage
Adds focused tests for explicit tunnel helpers in `mempalace/palace_graph.py`.

Covered:
- `_load_tunnels`
- `_save_tunnels`
- `create_tunnel`
- `list_tunnels`
- `delete_tunnel`
- `follow_tunnels`
2026-04-15 11:38:18 +02:00
Marcio E. Heiderscheidt e61dc2adf8 fix: add provenance header and speaker IDs to Slack transcript imports (#815)
* fix: add provenance header and speaker IDs to Slack transcript imports

Slack exports are multi-party chats where no speaker is inherently
the "user" or "assistant". The parser previously assigned these roles
purely by position, allowing a crafted export to place attacker text
in the "user" role — making it appear as the memory owner's words
in all future retrieval (data poisoning via stored memory).

Changes:
- Add provenance header marking Slack transcripts as multi-party
  with positional (unverified) role assignment
- Prefix each message with the original speaker ID ([U1], [U2], etc.)
  so downstream consumers can distinguish authors
- Keep user/assistant role alternation for exchange-pair chunking
  compatibility with convo_miner.py

Tests:
- Provenance header presence and content
- Speaker ID preservation in output
- Attacker-first-message attribution verification

Refs: MemPalace/mempalace#809

* fix: move Slack provenance to footer, sanitize speaker IDs, extract constant

- Move provenance notice from header to footer to prevent it becoming
  a standalone ChromaDB drawer via paragraph chunking on exports
  with fewer than 3 exchange pairs (violates verbatim-always principle)
- Sanitize speaker user_id/username: strip brackets, newlines, and
  control characters to prevent chunk-boundary injection via crafted
  Slack exports
- Extract header string to _SLACK_PROVENANCE_FOOTER module constant,
  consistent with _TOOL_RESULT_* constants pattern; tests import it
  instead of duplicating the literal

Refs: MemPalace/mempalace#809
2026-04-15 00:27:01 -07:00
sha2fiddy a15094ce60 feat: include created_at timestamp in search results (#846)
* feat: include created_at timestamp in search results (closes #465)

Surface the existing filed_at metadata as created_at in search result
objects returned by search_memories(). Enables temporal reasoning over
search hits without additional queries.

* Feat: add fallback for missing filed_at metadata
2026-04-15 00:26:57 -07:00
Mikhail Valentsev ecd44f7cb7 fix(hooks): stop precompact hook from blocking compaction (#856, #858) (#863)
* fix(hooks): stop precompact hook from blocking compaction

The precompact hook unconditionally returned {"decision": "block"},
which in Claude Code means "cancel compaction" with no retry mechanism.
This made /compact permanently broken for all plugin users.

Changed hook_precompact() to mine the transcript synchronously (so data
lands before compaction) and return {"decision": "allow"}. This matches
the standalone bash hook in hooks/ which already uses allow.

Also extracted _get_mine_dir() and _mine_sync() helpers so precompact
can mine from the transcript directory, not just MEMPAL_DIR.

Stop hook behavior is unchanged -- left for #673 which implements the
full silent save path.

Closes #856, closes #858.

* fix: use empty JSON instead of invalid \"allow\" decision value

Claude Code only recognizes \"block\" as a top-level decision value.
\"allow\" is a permissionDecision value for PreToolUse hooks, not a
valid top-level decision. The correct way to not block is to return
empty JSON. Caught by #872.
2026-04-15 00:26:54 -07:00
Arnold Wender b226251ddf fix(mcp): redirect stdout to stderr during import to protect JSON-RPC channel (#225) (#864)
* fix(mcp): redirect stdout to stderr during import to protect JSON-RPC channel (#225)

Fixes #225.

Several transitive dependencies (chromadb, onnxruntime, posthog) print
banners and warnings to stdout — sometimes at the C level — during the
mcp_server import chain. Because the MCP protocol multiplexes JSON-RPC
over stdio, any non-JSON output on stdout corrupted the message stream
and broke Claude Desktop's parser with errors like:

  MCP mempalace: Unexpected token '*', "**********"... is not valid JSON
  MCP mempalace: Unexpected token 'E', "EP Error D"... is not valid JSON
  MCP mempalace: Unexpected token 'F', "Falling ba"... is not valid JSON

Reproduced on Windows 11 with mempalace 3.0.0 / Python 3.10 / Claude
Desktop 1.1062.0.

Fix: at module load, redirect stdout to stderr at both the Python level
(sys.stdout = sys.stderr) and the file-descriptor level (os.dup2(2, 1))
to catch C-level prints, while preserving the real stdout for later
restore. main() calls _restore_stdout() right before entering the
protocol loop so JSON-RPC responses still go to the real stdout.

Adds tests/test_mcp_stdio_protection.py with three regression tests:
- module-level redirect is in place after import
- _restore_stdout() restores the original stdout (idempotent)
- 'python -m mempalace.mcp_server' with empty stdin emits no stdout

* style: reformat with ruff 0.4 (CI version) for #225
2026-04-15 00:26:51 -07:00
Arnold Wender 0aee6f3ed9 fix(init): auto-add per-project files to .gitignore in git repos (#185) (#866)
Partially addresses #185.

`mempalace init <dir>` writes `mempalace.yaml` and `entities.json` into
the project root. When <dir> is a git repository, those files have no
default protection and risk being committed by accident — the loudest
concern in the original report.

This PR adds `_ensure_mempalace_files_gitignored()` which runs at the
end of cmd_init: if <dir>/.git exists, append the two filenames to
.gitignore (creating it if necessary) under a clearly-marked block.

The helper is conservative:
- only runs when <dir>/.git is present (no-op for non-git projects)
- skips entries already present (no duplicates)
- preserves existing .gitignore content
- handles files without trailing newlines

This does NOT relocate the files to ~/.mempalace/wings/<wing>/ as the
issue's 'Expected' section proposes — that's a behavioral change with
miner/config implications and warrants a separate design discussion.
The gitignore safeguard removes the immediate risk without breaking any
existing flow.

Tests: 5 cases in tests/test_init_gitignore_protection.py covering
no-op, fresh creation, partial append, idempotency, and missing-newline
edge case.
2026-04-15 00:26:41 -07:00
Arnold Wender 6a73eb2e20 fix(searcher): guard against empty ChromaDB query results (#195) (#865)
Fixes #195.

When ChromaDB returns no documents (empty palace, or wing/room filter
that excludes everything), it returns the shape:

    {"documents": [], "metadatas": [], "distances": []}

Indexing `results["documents"][0]` blindly raises IndexError instead of
the expected 'no results' response. Affected: searcher.search(),
searcher.search_memories() (drawer + closet branches plus the
total_before_filter aggregate), and Layer3.search() / Layer3.search_raw().

Adds a tiny private helper `searcher._first_or_empty(results, key)` that
safely extracts the inner list, returning [] for any of: missing key,
empty outer list, [None], or [[]]. layers.py imports the same helper to
avoid duplicating the guard.

Tests: tests/test_empty_chromadb_results.py covers all observed shapes
plus a documentation-style test that pins the original IndexError so
future readers understand why the helper exists.
2026-04-15 00:26:38 -07:00
Mikhail Valentsev 54a386d925 fix: return empty status instead of error on cold-start palace (#830) (#831)
tool_status() called _get_collection() with the default create=False,
which throws when the ChromaDB collection does not exist yet (valid
palace, zero drawers). The exception was swallowed and status returned
"No palace found" even though init had completed successfully.

Switching to create=True bootstraps an empty collection on first
status call, matching what the write path already does.

Fix suggested by @hkevinchu in the issue.
2026-04-15 00:26:35 -07:00
Marcio E. Heiderscheidt f20f45a2da fix: make entity_registry.research() local-only by default (#811)
* fix: make entity_registry.research() local-only by default

research() previously called _wikipedia_lookup() unconditionally,
sending entity names to en.wikipedia.org on every uncached lookup.
This violates the project's local-first and privacy-by-architecture
principles documented in CLAUDE.md.

Changes:
- research() now returns "unknown" for uncached words by default
- New allow_network=True parameter required for Wikipedia lookups
- Wikipedia 404 now returns "unknown" instead of asserting "person"
  with 0.70 confidence, preventing entity registry poisoning
- Added privacy warning docstring to _wikipedia_lookup()
- Added tests for local-only default, opt-in network, 404 handling,
  and cache-not-persisted-on-local-only behaviour

Refs: MemPalace/mempalace#809

* fix: improve research() cache read path and deduplicate test mocks

- Use .get() instead of .setdefault() for cache reads in research()
  so the local-only path never mutates _data unnecessarily
- Move .setdefault() to the network-write path only
- Use result.setdefault() for word/confirmed keys to ensure
  consistent return shape across all _wikipedia_lookup error paths
- Extract duplicated mock_result dict into _MOCK_SAOIRSE_PERSON
  constant shared by 3 test functions
2026-04-15 00:26:24 -07:00
mvalentsev d565718922 fix: address i18n review issues from PR #718
Three issues flagged by bensig on the i18n PR before merge:

1. ko.json: status_drawers used {drawers} instead of {count}, causing
   the Korean UI to show the raw template string instead of the actual
   drawer count.  All other 7 languages use {count}.

2. Test file was shipped inside the package at mempalace/i18n/test_i18n.py
   with a sys.path.insert hack.  Moved to tests/test_i18n.py per the
   project convention in AGENTS.md.

3. Dialect.from_config() passed lang=config.get("lang") which defaults
   to None, causing __init__ to inherit whatever language was loaded
   earlier via module-level state.  Now defaults to "en" explicitly so
   from_config is deterministic regardless of prior load_lang() calls.

Added two regression tests for the ko.json fix and the state leak.
2026-04-15 11:03:28 +05:00
Igor Lins e Silva 107685930d docs+tests: fix CI after README slim (#875)
The regression-guard tests added in #835 were pinned to the old
README shape (tool table + file-reference table). When #897 slimmed
the README and moved that content to the website, three tests
started failing:

  TestReadmeToolsExistInCode.test_every_readme_tool_exists_in_tools_dict
  TestNoUnlistedTools.test_no_undocumented_tools
  TestReadmeDialectNotLossless.test_readme_dialect_line_not_lossless

Changes in this commit:

1. Update the 3 tests to track the new canonical docs surfaces
   - Tool list -> website/reference/mcp-tools.md
     (tests parse `### \`mempalace_xxx\`` headings instead of
     markdown table rows).
   - dialect.py lossless disclaimer -> website/reference/modules.md
     (any line mentioning dialect.py must not also say "lossless").

2. Fix the website to make "no undocumented tools" true
   Add the 10 tools that existed in TOOLS but were missing from
   website/reference/mcp-tools.md (create_tunnel, delete_tunnel,
   follow_tunnels, list_tunnels, get_drawer, list_drawers,
   update_drawer, hook_settings, memories_filed_away, reconnect).
   Page header now correctly says "all 29 MCP tools".

3. Align pre-commit ruff pin to match CI (0.4.x)
   .pre-commit-config.yaml was pinning ruff v0.9.0, while
   .github/workflows/ci.yml installs ruff>=0.4.0,<0.5. The two
   formatters produce incompatible output (e.g. v0.9.0 reformats
   `assert (x), msg` -> `assert x, (msg)` in a way v0.4.x rejects),
   which would cause the pre-commit hook to modify files that CI
   then flags as unformatted. Pinning the hook to v0.4.10 keeps
   the dev loop and CI in lock-step.

Full suite: 887 passed, 0 failed.
2026-04-14 21:59:55 -03:00
MSL 3094c0bd10 fix: add missing self._lock to KnowledgeGraph.close()
TDD: test first, failed, fixed, passed.

Igor fixed query_relationship/timeline/stats in an earlier commit.
close() was the last method touching self._connection without
holding the lock.

Closes #883.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:09:10 -07:00
Igor Lins e Silva c9b3245994 Merge pull request #880 from MemPalace/perf-optimize-regex-compilation-15578943484596502942
 Optimize regex compilation in entity extraction
2026-04-14 15:10:34 -03:00
Milla J 3ac75d0fdb feat: add MEMPAL_VERBOSE toggle — developers see diaries in chat (#871)
export MEMPAL_VERBOSE=true  → hook blocks, agent writes diary in chat
export MEMPAL_VERBOSE=false → silent background save (default)

Developers need to see code and diaries being written.
Regular users want zero chat clutter. Now both work.

TDD: tests written first, failed, code fixed, tests pass.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:55:56 -07:00
google-labs-jules[bot] 21793cfb48 perf: optimize regex compilation in entity extraction
Move regular expression compilation to the module level in `dialect.py` to prevent repeated parsing during loop execution.

Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>
2026-04-14 17:43:26 +00:00
Igor Lins e Silva 4741bc0055 Merge pull request #873 from sha2fiddy/feature/455/kg-sanitize-punctuation
fix: use permissive validator for KG entity values
2026-04-14 14:15:33 -03:00
Igor Lins e Silva e1d24d8087 Merge pull request #812 from Kesshite/fix/security-hook-injection
fix: harden hooks against shell injection, path traversal, and arithmetic injection
2026-04-14 14:10:33 -03:00
Matt Van Horn e8e93b53c0 fix: allow mining directories without local mempalace.yaml
When no mempalace.yaml or mempal.yaml exists in the source directory,
return a default config (wing = directory name, room = general) instead
of calling sys.exit(1). This lets users mine any directory into their
palace without requiring init first.

Closes #14.
2026-04-14 13:53:07 -03:00
eblander 79c9c0e517 fix: use permissive validator for KG entity values (closes #455)
sanitize_name rejects commas, colons, parentheses, and slashes — characters
that commonly appear in knowledge graph subject/object values. Adds
sanitize_kg_value for KG entity fields (subject, object, entity) while
keeping sanitize_name for predicates and wing/room names.
2026-04-14 09:26:47 -04:00
BLUDATA\marcio.heiderscheidt f7d703fd5b fix: add logging on rejected transcript paths and platform-native path test
- _count_human_messages() now logs a WARNING via _log() when a
  non-empty transcript_path is rejected by the validator, making
  silent auto-save failures diagnosable via hook.log
- Add test for platform-native paths (backslashes on Windows) to
  verify _validate_transcript_path works cross-platform
- Add test verifying the warning log is emitted on rejection

Refs: MemPalace/mempalace#809
2026-04-14 07:54:42 -03:00
BLUDATA\marcio.heiderscheidt 0f217f7c80 fix: harden hooks against shell injection, path traversal, and arithmetic injection
save_hook.sh:
- Coerce stop_hook_active to strict True/False before eval to prevent
  command injection via crafted JSON (e.g. "$(curl attacker.com)")
- Validate LAST_SAVE as plain integer with regex before bash arithmetic
  to prevent command substitution via poisoned state files

hooks_cli.py:
- Add _validate_transcript_path() that rejects paths with '..'
  components and non-.jsonl/.json extensions
- _count_human_messages() now uses the validator, returning 0 for
  invalid paths instead of opening arbitrary files

Tests:
- Path traversal rejection (../../etc/passwd)
- Wrong extension rejection (.txt, .py)
- Valid path acceptance (.jsonl, .json)
- Empty string handling
- Shell injection in stop_hook_active field

Refs: MemPalace/mempalace#809
2026-04-14 07:54:42 -03:00
Igor Lins e Silva 267a644f4f refactor: route all chromadb access through ChromaBackend
Prerequisite for RFC 001 (plugin spec, #743). Removes every direct
`import chromadb` outside the ChromaDB backend itself so the core
modules depend only on the backend abstraction layer.

Extends ChromaBackend with make_client, get_or_create_collection,
delete_collection, create_collection, and backend_version. Adds
update() to the BaseCollection contract. Non-backend callers
(mcp_server, dedup, repair, migrate, cli) now go through the
abstraction; tests patch ChromaBackend instead of chromadb.

With this landed, the RFC 001 spec can be enforced and PalaceStore
(#643) can ship as a plugin without touching core modules.
2026-04-14 00:31:16 -03:00
Milla J 045023f449 fix: save hook auto-mines transcript without MEMPAL_DIR (#840)
TDD: test written first, failed, then fixed.

Problem: save hook says "saved in background" but MEMPAL_DIR defaults
to empty, so nothing actually mines. Users get no auto-save despite
the hook firing every 15 messages.

Fix: use TRANSCRIPT_PATH (received from Claude Code in the hook's
JSON input) to discover the session directory. Mine that directory
automatically. MEMPAL_DIR is still supported as override but no
longer required.

Also fixed: bare python3 → $(command -v python3) for nohup safety.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:09:59 -07:00
Igor Lins e Silva 5320246297 Merge pull request #807 from sha2fiddy/fix/218-cosine-distance-metadata
Fix: set cosine distance metadata on all collection creation sites
2026-04-13 21:18:40 -03:00
Milla J 62df24599e fix: README audit — 42 TDD tests + hall detection + 7 claim fixes (#835)
* fix: README audit — match every claim to shipped code + add hall detection

TDD audit: wrote 42 tests verifying README claims against codebase.
Fixed all 7 failures:

1. Tool count: 19 → 29 (10 tools were undocumented)
2. Added tool table rows for tunnels, drawer management, system tools
3. Version badge: 3.1.0 → 3.2.0
4. dialect.py file reference: "30x lossless" → "AAAK index format for closet pointers"
5. Wake-up token cost: "~170 tokens" → "~600-900 tokens" (matches layers.py)
6. pyproject.toml version in project structure: v3.0.0 → v3.2.0
7. Hall detection: added detect_hall() to miner.py — drawers now tagged
   with hall metadata so palace_graph.py can build hall connections

New code:
- miner.py: detect_hall() — keyword scoring against config hall_keywords,
  writes hall field to every drawer's metadata
- tests/test_hall_detection.py — 12 TDD tests (written before code)
- tests/test_readme_claims.py — 42 TDD tests verifying README accuracy

859/859 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve ruff lint — unused imports and variables

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: ruff format with CI-pinned 0.4.x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use conftest fixtures in hall tests for Windows compat

Windows CI fails with NotADirectoryError when ChromaDB tries to
write HNSW files in short-lived TemporaryDirectory. Use conftest
palace_path and tmp_dir fixtures instead — same pattern as all
other tests that touch ChromaDB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Igor's review — convo_miner halls, cached config, markdown typo

TDD: wrote tests for convo_miner hall metadata and config caching
BEFORE verifying the code changes.

1. README markdown typo: extra ** in wake-up token row (line 195)
2. convo_miner.py: added _detect_hall_cached() — conversation
   drawers now get hall metadata (was missing, Igor caught it)
3. miner.py + convo_miner.py: cached hall_keywords at module level
   so config.json isn't re-read per drawer during bulk mine
4. New tests: TestConvoMinerWritesHalls, TestDetectHallCaching

861/861 tests pass. ruff clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:11:11 -07:00
eblander 8dc5970ca9 Fix: ruff format with CI-pinned version (0.4.x) 2026-04-13 18:29:48 -04:00
Igor Lins e Silva 1dc20e307b test: verify mine_lock via disjoint critical-section intervals
The previous revision used multiprocessing but still relied on timing
("second process waited at least N seconds") which flakes on CI where
spawn overhead eats into the hold window. Linux CI observed the second
process report a 0.088s wait — below the 0.1s threshold — even though
the lock behavior was correct; spawn was just slow enough that the
first process had nearly finished holding when the second got past
its own spawn.

Switch to effect-based verification: each worker logs its
[enter_time, exit_time] inside the critical section, and the test
asserts the two intervals are disjoint after sorting. A broken lock
would produce overlapping intervals regardless of spawn latency; a
working lock cannot.

Also removed the mp.Queue since we no longer pass timing data back.
2026-04-13 19:08:57 -03:00
Igor Lins e Silva e052074624 test: serialize mine_lock concurrency test with multiprocessing
The macOS CI job failed ``test_lock_blocks_concurrent_access`` because
``fcntl.flock`` on BSD/macOS is per-*process*, not per-FD: two threads
in the same process both acquire even when they open their own file
descriptors. The test passed on Linux (per-FD flock) and Windows
(per-FD ``msvcrt.locking``) but was never actually exercising the
lock's real contract.

``mine_lock`` is designed to serialize multi-*agent* access — i.e.,
separate processes, not threads. Switch the test to
``multiprocessing.get_context('spawn')`` with a module-level worker
(so the spawn pickles cleanly) so it:

  1. reflects the actual use case (one lock per mining process);
  2. passes on all three OSes without flock-semantics branching;
  3. catches real regressions (a broken lock would now let both
     processes through, exactly what we care about).

Hold time bumped to 0.3s and the "wait until p1 acquires" delay to
0.2s to tolerate spawn's higher startup latency on macOS/Windows.
2026-04-13 19:02:51 -03:00
Igor Lins e Silva 7192552624 test: make diary state path assertion platform-neutral
The Windows CI job failed on:

    assert '/.mempalace/state/' in str(state_path)

because Windows uses ``\`` as the path separator, so the substring
never matches. The behavior under test (state file lives outside the
diary dir, under ``~/.mempalace/state/``) is already correct on both
platforms — only the assertion was Unix-only.

Switch to ``state_path.parent`` comparisons that work on any OS.
2026-04-13 18:55:36 -03:00
Igor Lins e Silva 6b7dcc53d4 merge: pr/closet-llm-generic + harden LLM regen path for production
Brings in PR #793 (optional LLM-based closet regeneration via
user-configured OpenAI-compatible endpoint) and PR #795 (hybrid
closet+drawer search — closets boost, never gate). Stack: #784#788#789#790#791#792#793 (+ #795).

Findings hardened on our side
─────────────────────────────

1) closet_llm.regenerate_closets didn't use the blessed palace helpers.

   Before:
     * manual closets_col.get(where=...) + .delete(ids=...) with a
       silent ``except Exception: pass`` around both — if the purge
       failed, pre-existing regex closets survived alongside fresh LLM
       closets, giving the searcher double hits for the same source.
     * ``source.split('/')[-1][:30]`` to build the closet_id — quietly
       wrong on Windows paths (``C:\\proj\\a.md`` has no ``/``, so the
       whole string ends up in the ID).
     * no mine_lock around purge+upsert — a concurrent regex rebuild of
       the same source could interleave with our purge and leave a mix
       of regex and LLM pointers.
     * no ``normalize_version`` stamp on the LLM closets — the miner's
       stale-version gate would treat them as leftovers from an older
       schema and rebuild over them on the next mine.

   After: routes through ``purge_file_closets`` + ``mine_lock`` +
   ``os.path.basename`` + ``NORMALIZE_VERSION`` stamp. Regression tests
   cover each.

2) searcher.search_memories was still closet-first.

   PR #795 merged into #793's head to fix the recall regression
   documented in that PR (R@1 0.25 on narrative content vs. 0.42
   baseline). The hybrid design makes closets a ranking boost rather
   than a gate: drawers are always queried at the floor, and matching
   closet hits (rank 0-4 within CLOSET_DISTANCE_CAP=1.5) add a boost
   of 0.40/0.25/0.15/0.08/0.04 to the effective distance.

   Merged to take the incoming hybrid design, with two cleanups:
   * kept the ``_expand_with_neighbors`` / ``_extract_drawer_ids_from_closet``
     helpers as separately-tested utilities (still imported by tests
     and future callers);
   * replaced the fragile ``source_file.endswith(basename)`` reverse-
     lookup in the enrichment step with internal ``_source_file_full``
     / ``_chunk_index`` fields stripped before return, so enrichment
     doesn't silently pick the wrong path when two sources share a
     basename across directories;
   * drawer-grep enrichment now sorts by ``chunk_index`` before
     neighbor expansion, so ``best_idx ± 1`` corresponds to actual
     document order rather than whatever order Chroma returned.

3) Closet-first tests in test_closets.py (``TestSearchMemoriesClosetFirst``,
   end-to-end ``test_closet_first_search_includes_drawer_index_and_total``)
   pinned contracts that the hybrid path now violates (``matched_via``
   went from ``"closet"`` to ``"drawer+closet"``). Rewrote them around
   the new invariant: direct drawers are always the floor, closet
   agreement flips the hit's matched_via and exposes closet_preview.

Verification
────────────

* 805/805 pass under ``uv run pytest tests/ -v --ignore=tests/benchmarks``
  (13 new tests from PR #793 + 5 from PR #795 + 2 new regressions for
  the closet_llm hardening + the rewritten hybrid assertions in
  test_closets.py).
* CI-pinned ruff 0.4.x clean on ``mempalace/`` + ``tests/`` (check +
  format both pass).
* No new deps — closet_llm.py still uses stdlib ``urllib.request`` per
  the PR's "zero new dependencies" promise.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
2026-04-13 18:40:36 -03:00
Igor Lins e Silva 1263c3c91e merge: full hardened stack + rewrite fact_checker around actual KG API
Merges the full hardened stack (up through #791 drawer-grep) and turns
fact_checker from "dead code hidden behind bare except" into an
actually-working offline contradiction detector with tests.

## Dead paths the PR body advertised but the code never executed

Both buried by a single outer ``except Exception: pass``:

  * ``kg.query(subject)`` — ``KnowledgeGraph`` has no ``query()`` method;
    it has ``query_entity()``. The attribute error was silently swallowed
    and the entire KG branch always returned ``[]``. Now using
    ``kg.query_entity(subject, direction="outgoing")`` with proper
    handling of the ``predicate``/``object``/``current``/``valid_to``
    fields the real API returns.
  * ``KnowledgeGraph(palace_path=palace_path)`` — the constructor's only
    kwarg is ``db_path``. Passing ``palace_path`` raised TypeError,
    silently swallowed. Now computing the db_path correctly from
    ``<palace>/knowledge_graph.sqlite3``, matching the convention the
    MCP server already uses.

## Contradiction logic rewritten

The previous ``if kg_pred in claim and fact.object not in claim`` only
fired when text used the SAME predicate word as the KG fact — the exact
opposite of the stated use case ("Bob is Alice's brother" when KG says
husband" would NOT have fired). Replaced with a proper parse → lookup
→ compare pipeline:

  * ``_extract_claims`` parses two surface forms ("X is Y's Z" and
    "X's Z is Y") into ``(subject, predicate, object)`` triples.
  * ``_check_kg_contradictions`` pulls the subject's outgoing facts
    and flags two classes:
      - ``relationship_mismatch`` when a current KG fact matches the
        same ``(subject, object)`` pair but with a different predicate.
      - ``stale_fact`` when the exact triple exists but is
        ``valid_to``-closed in the past.
  * Stale-fact detection is now implemented (the PR body claimed it;
    the old code silently didn't implement it).

## Performance fix — O(n²) → O(mentioned × n)

``_check_entity_confusion`` previously computed Levenshtein for every
pair of registered names on every ``check_text`` call. For 1,000
registered names that's ~500K edit-distance calls per hook invocation.
Now we first identify which registry names actually appear in the text
(single regex scan), then only compute edit distance between mentioned
and unmentioned names. Pinned by a test that asserts <200ms on a 500-
name registry with zero mentions.

Also: when *both* similar names are mentioned in the text, we no
longer flag them — the user clearly knows they're different people.

## Shared entity-registry loader

``mempalace/miner.py`` already had an mtime-cached loader for
``~/.mempalace/known_entities.json``. fact_checker had a duplicate
implementation that leaked file handles and ignored caching. Extended
miner's cache to expose both the flat set (``_load_known_entities``)
and the raw category dict (``_load_known_entities_raw``); fact_checker
now imports the latter. No more double disk reads, no more handle leak.

## Tests — 24 cases in tests/test_fact_checker.py

All three detection paths + both dead-code regressions:
  * ``test_kg_init_uses_db_path_not_palace_path_kwarg`` — pins the
    correct KG constructor signature so the ``palace_path=`` bug can't
    come back.
  * ``test_relationship_mismatch_detected`` — the headline example from
    the PR body now actually fires.
  * ``test_stale_fact_detected`` — valid_to-closed triple is flagged.
  * ``test_current_fact_same_triple_is_not_flagged`` — no false positive
    on a still-valid match.
  * ``test_performance_bounded_by_mentioned_names`` — 500-name registry,
    zero mentions, <200ms. Regression for the O(n²) blowup.
  * ``test_no_false_positive_when_both_names_mentioned`` — Mila and
    Milla in the same text is fine.
  * Plus claim extraction, flatten_names shapes, CLI exit code, empty
    text handling, missing-palace graceful fallback, registry-dict
    shape support.

785/785 suite pass. ruff + format clean on CI-pinned 0.4.x.
2026-04-13 18:20:11 -03:00