mempalace

Author	SHA1	Message	Date
eldar702	5347c2c71c	fix(searcher): clamp effective_distance to valid cosine range [0, 2] ``search_memories`` computes ``effective_dist = dist - boost`` where ``boost`` can be as large as ``CLOSET_RANK_BOOSTS[0] == 0.40`` for a rank-0 closet hit. When the raw drawer distance is small — any near-exact match — the subtraction goes negative. Two downstream effects: 1. Line 418 returns ``round(max(0.0, 1 - effective_dist), 3)`` as ``similarity``. With ``effective_dist = -0.30`` that yields ``similarity = 1.30``, outside the documented ``[0, 1]`` range. The ``max(0.0, ...)`` only prevents negative similarities; it does not cap above 1. 2. Line 427 stores ``_sort_key: effective_dist`` and line 435 sorts ``scored`` ascending by that key. A negative key drops below the rest, so the strongest hybrid matches end up sorting after weaker ones — ranking inversion under the exact conditions hybrid retrieval is supposed to serve best. Clamp ``effective_dist`` to the valid cosine-distance range ``[0, 2]``. The boost still wins (closet-backed hit still ranks first), it just no longer flips the order. Test added: mock drawer_col (base dist 0.08 / 0.35 for two sources) + closet_col (rank-0 closet for the 0.08 source) → assert all hits have ``0 <= similarity <= 1`` and ``0 <= effective_distance <= 2``, and that the closet-boosted source still ranks first. Relationship to other PRs: * #988 clamps the output ``similarity`` alone. That does not fix the sort-key inversion or the invalid ``effective_distance`` in the returned dict. This PR clamps at the arithmetic source so both downstream users of the value stay in range. * Orthogonal to #979 (``tool_check_duplicate`` negative similarity).	2026-05-06 02:19:54 -03:00
Chris Antenesse	733e435332	fix(searcher): guard against None metadata/doc in search result loops ChromaDB can return None entries in metadatas/documents lists under partial-flush, mid-delete, upgrade-boundary, and interrupted-mine states. Add `meta = meta or {}` and `doc = doc or ""` guards in the three result loops (search display, closet hybrid, drawer scored) so .get() and .strip() calls never crash on None. Fixes #1007, #1011	2026-05-06 01:59:24 -03:00
Igor Lins e Silva	46d9eb5df0	Merge pull request #1375 from MemPalace/fix/lint-e402-test-hooks-cli fix(lint): hoist hooks_cli_mod import to top of test_hooks_cli (E402)	2026-05-06 01:58:29 -03:00
Igor Lins e Silva	f854da779f	fix(lint): hoist hooks_cli_mod import to top of test_hooks_cli (E402) The alias was placed below an explanatory comment block introduced by #1305, which trips ruff E402 (module-level import not at top of file). Moved next to the existing 'from mempalace.hooks_cli import (...)' line. CI lint went red on develop after #1305 merged with the failing check; this re-greens it so subsequent PRs do not inherit the failure.	2026-05-06 01:57:44 -03:00
Igor Lins e Silva	67cda9d455	Merge pull request #1030 from eldar702/fix/none-metadata-residual-guards fix: guard None metadata/doc in tool_check_duplicate and Layer1/Layer2	2026-05-06 01:51:24 -03:00
Igor Lins e Silva	0c8314f919	Merge pull request #1060 from alonehobo/fix/stdio-utf8 fix(mcp): force UTF-8 on stdio to fix -32000 on non-ASCII payloads (Windows)	2026-05-06 01:49:56 -03:00
Igor Lins e Silva	642a073305	Merge pull request #1114 from Sathvik-1007/fix/list-drawers-pagination-total fix: add total count to tool_list_drawers pagination response	2026-05-06 01:49:37 -03:00
Igor Lins e Silva	d9ab5b7fd3	Merge pull request #1305 from lcatlett/upstream/respect-absent-palace-dir fix(hooks): treat absent ~/.mempalace as auto-save off	2026-05-06 01:49:22 -03:00
Igor Lins e Silva	ea6f2c0c4c	Merge pull request #1162 from imtylervo/fix/palace-write-lock-queue-pattern fix: serialize ChromaCollection writes through palace lock	2026-05-06 01:48:51 -03:00
Igor Lins e Silva	d1e27b8c42	style: ruff format new test files (CI lint)	2026-05-06 01:47:46 -03:00
Igor Lins e Silva	5ae83d8ec3	Merge pull request #1370 from MemPalace/docs/changelog-v3.3.5-batch1 docs(changelog): batch entries for 7 v3.3.5 fixes	2026-05-06 01:40:21 -03:00
Igor Lins e Silva	2c0ef2c04e	docs(changelog): document v3.3.5 fixes from #1214 #1105 #1215 #1107 #1282 #1167 #1160 Bundled CHANGELOG entries for the seven Tier-1 PRs merged today, including the behavior-change call-out for #1167 (KG date validators now reject non-ISO inputs that previously produced silent empty results).	2026-05-06 01:38:57 -03:00
Igor Lins e Silva	53675dd194	Merge pull request #1160 from mvalentsev/fix/mcp-kg-lazy-per-path-cache fix(mcp): lazy per-path KnowledgeGraph cache (#1136)	2026-05-06 01:33:47 -03:00
Igor Lins e Silva	7ede231da9	Merge pull request #1167 from arnoldwender/fix/kg-date-validation fix(kg): validate ISO-8601 date formats at MCP boundary	2026-05-06 01:33:27 -03:00
Igor Lins e Silva	3824ea610c	Merge pull request #1282 from mvalentsev/fix/fact-checker-stdio-utf8 fix(cli, fact-checker): reconfigure stdio to UTF-8 on Windows	2026-05-06 01:33:15 -03:00
Igor Lins e Silva	778f830cd0	Merge pull request #1107 from sha2fiddy/fix/1073-closet-llm-paginate fix: paginate closet_llm col.get (#1073)	2026-05-06 01:33:04 -03:00
Igor Lins e Silva	e18981a527	Merge pull request #1215 from arnoldwender/fix/entity-registry-atomic-write fix(entity_registry): atomic write to prevent partial corruption on crash	2026-05-06 01:32:46 -03:00
Igor Lins e Silva	ef0e45ad92	Merge pull request #1105 from mvalentsev/fix/chroma-backend-close-releases-lock fix(backends/chroma): release SQLite file lock on close_palace/close (#1067)	2026-05-06 01:32:30 -03:00
Igor Lins e Silva	0cfb4b3ef1	Merge pull request #1214 from arnoldwender/fix/kg-temporal-inversion-guard fix(kg): reject inverted intervals in add_triple (valid_to < valid_from)	2026-05-06 01:32:16 -03:00
fatkobra	6b042982e8	fix(repair): preflight SQLite integrity before rebuild	2026-05-05 15:51:43 +00:00
fatkobra	bb40a529fd	fix(migrate): verify write roundtrip before bailout	2026-05-05 09:01:05 +00:00
fatkobra	37e7d394b8	fix(repair): preflight poisoned max_seq_id	2026-05-05 07:22:10 +00:00
fatkobra	eff844b168	fix(storage): quarantine partial HNSW flush without metadata	2026-05-04 10:34:22 +00:00
Arnold Wender	2e441d17a2	fix(entity_registry): fsync parent dir after rename for ext4 durability Without this, on ext4 (and similar) filesystems the rename ack does not guarantee durability across power loss — a crash can revert to a state where the temp file is present and the target is at the old version. Suggested by @jphein on #1215.	2026-05-04 11:08:14 +02:00
Arnold Wender	4f36145c2e	fix(entity_registry): atomic write to prevent partial corruption on crash EntityRegistry.save() called Path.write_text() directly, which truncates the target file and then writes — so a crash mid-write (power loss, OOM, filesystem-full mid-flush) leaves an empty or half-written entity_registry.json. The whole people/projects map is lost; the system falls back to an empty registry on next load. Switch to the standard atomic-write pattern: serialize to a sibling .tmp file in the same directory (so os.replace stays on one filesystem), fsync, chmod 0o600, then os.replace over the target. The replace is atomic on POSIX and Windows, so any crash leaves the previous registry intact instead of a truncated file. Tests cover: no leftover .tmp on success, and previous content preserved when os.replace itself raises mid-save.	2026-05-04 11:08:14 +02:00
fatkobra	f9d939ae1b	fix(storage): quarantine bloated HNSW link payloads	2026-05-04 06:45:29 +00:00
mvalentsev	285b3b4f2e	refactor(stdio): extract Windows UTF-8 reconfigure into shared helper Both cli.py and fact_checker.py carried identical 28-line Windows stdio reconfigure helpers; pull the loop into mempalace/_stdio.py so the same machine drives the CLI, the fact_checker --stdin entry point, and the MCP server. The thin per-call-site wrappers stay so existing tests keep importing _reconfigure_stdio_utf8_on_windows from the same module they always have. CLI / fact_checker policy unchanged: stdin=surrogateescape (don't crash on a malformed redirected file), stdout/stderr=replace (don't crash mid-print on a surrogate half round-tripped from a filename).	2026-05-03 22:25:31 +05:00
mvalentsev	75ad8ae781	ci: retrigger linux 3.13 (transient onnx download flake)	2026-05-03 22:04:22 +05:00
mvalentsev	b8816e0fe2	fix(mcp): retry KG handlers once on concurrent close race Race scenario: a KG tool handler calls _get_kg() and gets a live KnowledgeGraph; another thread fires tool_reconnect() between that return and the handler's kg.add_triple()/kg.query_entity()/etc call. tool_reconnect drains _kg_by_path and closes the underlying sqlite3.Connection; the handler then raises sqlite3.ProgrammingError: 'Cannot operate on a closed database', which surfaces as a -32000 to the MCP client even though the user just asked for a reconnect. New _call_kg(op) helper wraps each handler's kg call in a one-shot retry: catch exactly sqlite3.ProgrammingError, evict the stale entry (only if the cache slot still points at the closed instance — another thread may have already replaced it), and rerun op against a fresh _get_kg(). Beyond one retry give up so a sustained close-stream surfaces clearly instead of looping. All five KG handlers (tool_kg_query, tool_kg_add, tool_kg_invalidate, tool_kg_timeline, tool_kg_stats) now route through _call_kg. Tests pin the contract: * retries with a fresh KG and returns the second result * non-ProgrammingError exceptions propagate without retry * gives up after exactly one retry on sustained close	2026-05-03 21:43:51 +05:00
mvalentsev	03643eb507	fix(cli, fact-checker): per-stream stdio errors policy on Windows Previously all three streams reconfigured to UTF-8 with errors='strict'. That kills 'mempalace search' the moment a drawer carrying a surrogate half (round-tripped from a filename via surrogateescape) hits print(), losing the rest of the result block. Same hazard for warning lines on stderr. Split the policy: stdin -> surrogateescape (malformed bytes from a redirected file survive as lone surrogates instead of crashing the read) stdout -> replace (drawer text with a stray surrogate becomes U+FFFD instead of UnicodeEncodeError mid-print) stderr -> replace (same protection for logger / warning paths) Applied identically in the cli.py and fact_checker.py helpers; the DRY extraction into a shared module is a separate cleanup ask, kept out of this fix to keep the diff narrow. Tests updated for the new per-stream assertion.	2026-05-03 21:37:12 +05:00
mvalentsev	32f4dfa26d	fix(cli): reconfigure stdio to UTF-8 on Windows The primary `mempalace` console_script (`cli.py:main()`) reads non-ASCII arguments via piped stdin and writes verbatim drawer text / wing names through `print()`. On Windows, Python defaults stdio to the system ANSI codepage (cp1252/cp1251/cp950), so: - `mempalace search "..." > out.txt` mojibakes any drawer text containing non-Latin characters - `mempalace ... < input.txt` mojibakes piped non-ASCII input Reconfigure stdin/stdout/stderr to UTF-8 (`errors="strict"`) at the top of `main()`, mirroring the helper added in this PR for fact_checker's `__main__` block. Wrapped in try/except so a replaced stream (Jupyter, test harness) logs a warning and continues rather than crashing the CLI. The reconfigure cascades through every `mempalace` subcommand (`init`/`mine`/`search`/`status`/`hook`/etc.) and through the interactive flows that read non-ASCII names via `input()` (onboarding, entity detector, room detector). With this commit the package's three user-facing entry points (`mempalace`, `mempalace-mcp`, and `python -m mempalace.fact_checker`) all reconfigure stdio identically on Windows.	2026-05-03 21:33:54 +05:00
mvalentsev	7cee74c8c8	fix(fact-checker): reconfigure stdio to UTF-8 on Windows The `python -m mempalace.fact_checker --stdin` entry point reads non-ASCII text through the system ANSI codepage (cp1252/cp1251/cp950) on Windows, which mojibakes characters before claim-extraction sees them. Reconfigure stdin/stdout/stderr to UTF-8 with `errors="strict"`, wrapped in try/except so a replaced stream (Jupyter, test harness) logs a warning rather than crashing the CLI. Mirrors the same fix shipped for `mcp_server.py:main()` (#400) and `hooks_cli.py:run_hook()` (#1280) -- this is the third and last stdin-reading entry point in the package.	2026-05-03 21:33:54 +05:00
mvalentsev	45df1a2657	fix(backends/chroma): release SQLite file lock on close_palace/close (#1067) ChromaBackend.close_palace() and close() evicted cached PersistentClients from self._clients without calling client.close(), so chromadb 1.5.x kept the rust-side SQLite file lock until GC. Reopening the same palace path after shutil.rmtree + re-create within one process then failed with SQLITE_READONLY_DBMOVED (SQLite code 1032). Add _close_client() helper with a try/except fallback for older chromadb, and route close_palace(), close(), and the DB-file-missing invalidation branch of _client() through it. The mtime/inode auto-invalidation branch is left as-is: callers there may still hold a live ChromaCollection handle, and closing out from under them clears the rust bindings mid-use. Regression tests cover close_palace reopen-same-path and whole-backend close for multiple palaces.	2026-05-03 19:16:25 +05:00
mvalentsev	0a62658051	fix(mcp): drain KG cache on tool_reconnect tool_reconnect cleared ChromaDB caches but left _kg_by_path entries intact. After an external replacement of knowledge_graph.sqlite3 the server kept serving the old open sqlite3.Connection, returning stale results. Now iterate _kg_by_path under _kg_cache_lock, call close() best-effort, and clear the dict so the next tool call reopens the KG from disk. Two new tests in TestKGLazyCache verify cache invalidation and that a failing close() does not block the clear.	2026-05-03 17:43:00 +05:00
mvalentsev	19f8a4ff68	style(mcp): drop issue-tracker comments from KG cache block Inline comments referencing #1136 and #540 add no information the identifiers do not already convey. PR description carries the context; code stays quiet.	2026-05-03 17:43:00 +05:00
mvalentsev	84f9726a39	test(mcp): fix Windows subprocess env in KG lazy-init test Passing a stripped env dict without SYSTEMROOT/WINDIR breaks Python bootstrap on Windows (_Py_HashRandomization_Init). Inherit the parent env and strip MEMPAL* vars instead, then override HOME/USERPROFILE to the tmp dir.	2026-05-03 17:43:00 +05:00
mvalentsev	c69a622a18	test(mcp): add multi-tenant and lazy-init tests for KG (#1136) TestKGLazyCache covers the scenarios behind the lazy per-path refactor: - test_lazy_init_no_import_side_effect: a fresh subprocess import does not create ~/.mempalace/knowledge_graph.sqlite3 (what closed PR #167 was aiming at). - test_get_kg_returns_same_instance: two _get_kg() calls under the same resolved path return the same object, cache has one entry. - test_get_kg_different_paths_different_instances: rotating env var produces distinct KGs. - test_multi_tenant_env_switch: the exact scenario from #1136: write under path A, query under path B returns empty, switching back to A sees the fact. - test_cache_thread_safe: 16 threads racing _get_kg() end up with one shared instance and one cache entry.	2026-05-03 17:43:00 +05:00
mvalentsev	9e730098e9	test(mcp): migrate _kg monkeypatches to _get_kg (#1136) Direct module-attribute patching of _kg is obsolete after the lazy cache refactor. Switch test helpers to patch _get_kg instead so the fixture KG replaces the factory rather than a now-missing singleton. - tests/test_mcp_server.py: _patch_mcp_server helper - tests/benchmarks/test_mcp_bench.py: _patch_mcp_config helper - tests/benchmarks/test_memory_profile.py: inline patch in test_tool_status_repeated_calls	2026-05-03 17:43:00 +05:00
mvalentsev	beac5d9954	refactor(mcp): replace eager _kg with lazy per-path cache (#1136) Swap the module-level KnowledgeGraph singleton for a lazy, per-path cache keyed by the resolved sqlite path. Import no longer creates a sqlite file as a side effect, and MCP servers started with --palace now route KG calls to the correct tenant when MEMPALACE_PALACE_PATH changes between calls, matching the per-call behavior of _get_client() on the ChromaDB side. Default-path behavior is preserved: without --palace at startup, KG stays on DEFAULT_KG_PATH regardless of env var. The "no --palace but env var set" case is #540's scope and is not changed here.	2026-05-03 17:43:00 +05:00
Igor Lins e Silva	1888b671e2	Merge pull request #1321 from MemPalace/fix/1313-init-palace-flag fix(cli): honor --palace flag in cmd_init (#1313)	2026-05-03 03:54:06 -03:00
Igor Lins e Silva	a91b7ee5c2	test(cli): prime monkeypatch undo so palace env doesn't leak monkeypatch.delenv(name, raising=False) on a missing key registers no undo entry, so the env var cmd_init writes leaked into test_config_from_file on Python 3.13 / Windows / macOS. Prime the slot with setenv before delenv so teardown rolls back the write.	2026-05-03 06:27:37 -03:00
Igor Lins e Silva	5380189f82	Merge pull request #1320 from MemPalace/fix/1314-kg-temporal-params fix(mcp): forward valid_to and source params in kg_add/kg_invalidate (#1314)	2026-05-03 03:51:29 -03:00
Igor Lins e Silva	0e65c54978	docs(mcp): drop §5.5 from kg_add docstring/schema The repo's anti-jargon meta-test bans §N markers outside the sources/backends allowlist. mcp_server.py isn't allowlisted, so the "RFC 002 §5.5" references added in this PR turned the test red. Trim to "RFC 002" — section number isn't load-bearing for the description.	2026-05-03 06:28:12 -03:00
Igor Lins e Silva	2ad379b547	Merge pull request #1306 from MemPalace/feat/hybrid-candidate-union feat(searcher): candidate_strategy="union" — BM25 candidates joined with vector pool before hybrid rerank	2026-05-03 03:40:51 -03:00
Igor Lins e Silva	3eb7980e55	fix(searcher): address Copilot review on #1306 - Dedup union candidates by (full_path, chunk_index), not basename — two files sharing a basename in different dirs no longer collide, and a vector hit on chunk N of a file no longer blocks BM25 from contributing chunk M of the same file. - Validate candidate_strategy at the top of search_memories so invalid values fail consistently, not only when the call routes through the vector path. - Trim hits back to n_results after the union+rerank pool grows; preserves the existing search_memories size contract that the MCP limit parameter is built on. - Skip BM25-only injection when max_distance > 0.0; BM25-only candidates carry distance=None and would silently bypass the caller's strict vector-distance threshold. Adds 4 tests covering: validation under vector_disabled, n_results trim, max_distance honoring, and basename-collision dedup.	2026-05-03 06:09:10 -03:00
Igor Lins e Silva	3e6f6480c0	Merge pull request #1325 from MemPalace/security/mcp-omit-absolute-paths fix(mcp): omit absolute filesystem paths from MCP tool responses	2026-05-03 03:20:11 -03:00
Igor Lins e Silva	7fc260f752	fix(mcp): basename source_file in tool_get_drawer responses The MCP `mempalace_get_drawer` tool returned the entire raw drawer metadata blob to any connected client, and the `source_file` field in that blob is the absolute filesystem path written by the miners (`miner.py`, `convo_miner.py` — `source_file = str(filepath)`). On a single-user local deployment this is self-disclosure, but in nested-agent or multi-server MCP topologies the client is a separate trust domain and the host's directory layout has no documented client-side use. Mirror the mitigation that `searcher.search_memories()` already applies on its own return path: reduce `source_file` to its basename via `Path(source_file).name` before handing the metadata to the client. Citations still work — the directory layout does not leak. Companion to #1 (omit palace_path from tool_status). Same threat class, different surface: - mempalace_status — palace dir path → fixed in #1 - mempalace_get_drawer — per-drawer source_file path → this PR Other read tools were audited and do not leak host paths: - mempalace_search — already basenames source_file - mempalace_list_drawers — returns wing/room/preview only - mempalace_diary_read — date/timestamp/topic/content only - mempalace_reconnect — success/message/drawers only - mempalace_kg_* — entity/predicate strings, counts - mempalace_check_duplicate — wing/room/preview only Changes: - mempalace/mcp_server.py: tool_get_drawer() now basenames metadata.source_file - tests/test_mcp_server.py: regression test asserting the absolute path and its parent directory do not appear anywhere in the response - website/reference/mcp-tools.md: clarify the documented return shape	2026-05-03 05:58:46 -03:00
icciAaron	b2f259c253	fix(mcp): omit palace_path from tool_status responses (+ docs) The MCP `mempalace_status` tool was returning the server's absolute `_config.palace_path` to any connected client on both the main (ChromaDB-backed) path and the sqlite fallback path that runs when HNSW divergence is detected (#1222). On a single-user local deployment this is self-disclosure, but in nested-agent or multi-server MCP topologies the client is a separate trust domain and the absolute path has no documented client-side use. Clients that legitimately need the palace path continue to have three documented channels: the `MEMPALACE_PALACE_PATH` env var (primary) or its legacy `MEMPAL_PALACE_PATH` alias, the `~/.mempalace/config.json` file, and the `--palace` CLI flag on most subcommands. Also corrects stale docs that claimed `mempalace_reconnect` returned a `palace_path` field; the code returns `{success, message, drawers, vector_disabled[, vector_disabled_reason]}` on success, plus a no-palace shape and an exception shape. - mempalace/mcp_server.py: drop palace_path from tool_status() and _tool_status_via_sqlite() result dicts - website/reference/mcp-tools.md: update documented return shapes for mempalace_status (fix) and mempalace_reconnect (stale-docs correction) Authored-by: Aaron Salsitz (ICCI LLC, @icciaaron). Claude Code was used as an authoring and review-orchestration tool, with human-in-the-loop oversight at every step: Aaron wrote the prompts, reviewed each draft, called for three independent review passes (drafting / post-rebase technical / CISA-aligned disclosure-leak), and verified the final patch behavior before commit.	2026-05-03 05:58:46 -03:00
Igor Lins e Silva	6f88b2a34e	Merge pull request #1322 from MemPalace/fix/1121-1132-1263-client-quarantine fix(backends/chroma): wire quarantine_stale_hnsw into _client() (#1121 #1132 #1263)	2026-05-03 03:18:28 -03:00
Igor Lins e Silva	a690eb398f	Merge pull request #1323 from MemPalace/fix/1243-diary-case-insensitive fix(mcp): case-insensitive agent name in diary read/write (#1243)	2026-05-03 03:18:11 -03:00
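
Note on 5347c2c71c (clamp effective_distance): the arithmetic described in that commit message can be sketched as below. This is a minimal illustration, not the code in searcher.py. The helper names effective_distance and similarity, the sample sources a.md and b.md, and every boost value other than the rank-0 value 0.40 are assumptions made for the example; the base distances 0.08 and 0.35 are the ones the commit's test description mentions.

```python
# Hypothetical boost table: only the rank-0 value (0.40) comes from the
# commit message; the remaining entries are placeholders for illustration.
CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15]


def effective_distance(dist: float, boost: float) -> float:
    """Subtract the closet boost, then clamp to the cosine-distance range [0, 2]."""
    return min(2.0, max(0.0, dist - boost))


def similarity(effective_dist: float) -> float:
    """Same expression the commit quotes for the returned similarity."""
    return round(max(0.0, 1 - effective_dist), 3)


# Two drawer hits: the first is backed by a rank-0 closet hit, the second is not.
hits = [
    {"source": "a.md", "dist": 0.08, "boost": CLOSET_RANK_BOOSTS[0]},
    {"source": "b.md", "dist": 0.35, "boost": 0.0},
]

scored = []
for hit in hits:
    eff = effective_distance(hit["dist"], hit["boost"])
    scored.append(
        {
            "source": hit["source"],
            "effective_distance": eff,
            "similarity": similarity(eff),
            "_sort_key": eff,
        }
    )

# Ascending sort by the clamped key, as described in the commit message.
scored.sort(key=lambda h: h["_sort_key"])

for row in scored:
    print(row["source"], row["effective_distance"], row["similarity"])
```

With the clamp in place, the boosted hit gets an effective distance of 0.0 and a similarity of 1.0 instead of a negative distance and a similarity above 1, and it still sorts ahead of the unboosted hit.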


815 Commits
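
Note on 4f36145c2e and 2e441d17a2 (entity_registry atomic write): the write sequence those two commit messages describe (sibling .tmp file, fsync, chmod 0o600, os.replace, then fsync of the parent directory) might look roughly like the sketch below. The function name atomic_write_json and its signature are hypothetical and are not taken from EntityRegistry.save(); restricting the directory fsync to POSIX is also an assumption of this sketch.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(target: Path, data: dict) -> None:
    """Write JSON so that a crash leaves either the old file or the new one."""
    target = Path(target)
    # Sibling temp file in the same directory, so os.replace stays on one filesystem.
    tmp = target.with_name(target.name + ".tmp")
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push file contents to disk before the rename
        os.chmod(tmp, 0o600)
        os.replace(tmp, target)  # atomic swap over the previous version
    except BaseException:
        # Leave no stray .tmp behind; the previous target is untouched.
        tmp.unlink(missing_ok=True)
        raise
    if os.name == "posix":
        # fsync the parent directory so the rename itself survives power loss.
        dir_fd = os.open(target.parent, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


def demo() -> dict:
    """Write twice into a throwaway directory and read the result back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "entity_registry.json"
        atomic_write_json(path, {"people": ["Ada"]})
        atomic_write_json(path, {"people": ["Ada", "Grace"]})
        leftover = path.with_name(path.name + ".tmp").exists()
        return {"content": json.loads(path.read_text(encoding="utf-8")), "leftover_tmp": leftover}


print(demo())
```

If os.replace raises, the except branch removes the temp file and re-raises, so the previous registry content stays in place. That matches the two behaviours the commit's tests are said to cover.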