Commit Graph

825 Commits

Author SHA1 Message Date
Igor Lins e Silva f4617b3d83 Merge pull request #1029 from eldar702/fix/searcher-effective-distance-clamp
fix(searcher): clamp effective_distance to valid cosine range [0, 2]
2026-05-06 03:20:59 -03:00
bobo-xxx f2bed9284f fix(layers): clamp similarity to [0,1] to avoid negative values 2026-05-06 02:20:47 -03:00
bobo-xxx eef053d750 fix(mcp_server): clamp similarity to [0,1] to avoid negative values 2026-05-06 02:20:47 -03:00
Igor Lins e Silva 74288f1cdd style: ruff format mcp_server.py (CI lint) 2026-05-06 02:20:00 -03:00
黄祖鑫(940219) 7b49478ef7 fix: MCP server JSON output ensure_ascii=False for non-ASCII support
Without ensure_ascii=False, non-ASCII characters (e.g. Chinese) in tool
results and JSON-RPC responses are escaped as \uXXXX, which causes
downstream MCP clients to receive escaped text instead of the original
characters. This affects all platforms, not just Windows.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 02:20:00 -03:00
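A minimal sketch of the escaping difference this commit describes, using only the standard `json` module. The payload and variable names are illustrative and are not taken from mcp_server.py.

```python
import json

# Hypothetical tool result containing non-ASCII text.
payload = {"result": "记忆宫殿"}

# Default behaviour: every non-ASCII character is written as a \uXXXX escape.
escaped = json.dumps(payload)

# With ensure_ascii=False the original characters are emitted as-is.
readable = json.dumps(payload, ensure_ascii=False)
```

Both strings decode to the same object; only the text that reaches the client differs.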
Igor Lins e Silva 869ab38095 style: ruff format mcp_server.py + test_mcp_server.py (CI lint) 2026-05-06 02:19:57 -03:00
Oleksii Pylypchuk a85d432b54 feat: add validation for missing name parameter in tools/call requests 2026-05-06 02:19:57 -03:00
Oleksii Pylypchuk 55d79dc8cd fix: include null id in JSON-RPC invalid request error responses and add validation tests 2026-05-06 02:19:57 -03:00
Oleksii Pylypchuk 0fdb480e12 fix(mcp): handle null JSON-RPC request payloads safely
When the MCP client sends a malformed or null top-level request, prevent the AttributeError on request.get() by explicitly validating that the request is a dictionary. Returns standard JSON-RPC Error -32600 (Invalid Request) instead of crashing the server.
2026-05-06 02:19:57 -03:00
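The validation described above might look roughly like the sketch below. `handle_request` is a hypothetical name, and the success branch is a placeholder; only the dictionary check and the -32600 error with a null id come from the commit messages.

```python
def handle_request(request):
    """Return a JSON-RPC response dict for a decoded top-level request."""
    # A null or otherwise non-object payload has no .get(); reject it up
    # front instead of letting an AttributeError escape.
    if not isinstance(request, dict):
        return {
            "jsonrpc": "2.0",
            "id": None,  # the id is unknown, so it is reported as null
            "error": {"code": -32600, "message": "Invalid Request"},
        }
    # Placeholder for real dispatch, which is outside the scope of this sketch.
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": {}}
```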
Igor Lins e Silva aac8437979 style: ruff format tests/test_searcher.py (CI lint) 2026-05-06 02:19:54 -03:00
eldar702 5347c2c71c fix(searcher): clamp effective_distance to valid cosine range [0, 2]
``search_memories`` computes ``effective_dist = dist - boost`` where
``boost`` can be as large as ``CLOSET_RANK_BOOSTS[0] == 0.40`` for a
rank-0 closet hit. When the raw drawer distance is small — any
near-exact match — the subtraction goes negative.

Two downstream effects:

1. Line 418 returns ``round(max(0.0, 1 - effective_dist), 3)`` as
   ``similarity``. With ``effective_dist = -0.30`` that yields
   ``similarity = 1.30``, outside the documented ``[0, 1]`` range.
   The ``max(0.0, ...)`` only prevents negative similarities; it does
   not cap above 1.
2. Line 427 stores ``_sort_key: effective_dist`` and line 435 sorts
   ``scored`` ascending by that key. A negative key drops *below* the
   rest, so the strongest hybrid matches end up sorting after weaker
   ones — ranking inversion under the exact conditions hybrid retrieval
   is supposed to serve best.

Clamp ``effective_dist`` to the valid cosine-distance range ``[0, 2]``.
The boost still wins (closet-backed hit still ranks first), it just no
longer flips the order.

Test added: mock drawer_col (base dist 0.08 / 0.35 for two sources) +
closet_col (rank-0 closet for the 0.08 source) → assert all hits have
``0 <= similarity <= 1`` and ``0 <= effective_distance <= 2``, and that
the closet-boosted source still ranks first.

Relationship to other PRs:

* **#988** clamps the output ``similarity`` alone. That does not fix
  the sort-key inversion or the invalid ``effective_distance`` in the
  returned dict. This PR clamps at the arithmetic source so both
  downstream users of the value stay in range.
* Orthogonal to **#979** (``tool_check_duplicate`` negative similarity).
2026-05-06 02:19:54 -03:00
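The arithmetic in this commit can be sketched as follows. The function names are hypothetical; the 0.40 boost, the `[0, 2]` clamp, and the `round(max(0.0, 1 - effective_dist), 3)` similarity formula are taken from the message above.

```python
def clamp_effective_distance(dist, boost):
    """Apply the closet boost, then clamp to the cosine-distance range [0, 2]."""
    return min(2.0, max(0.0, dist - boost))


def similarity(effective_dist):
    """Similarity as described in the commit; it stays in [0, 1] once the input is clamped."""
    return round(max(0.0, 1 - effective_dist), 3)


# Near-exact match (0.08) with a rank-0 closet boost (0.40), and a weaker
# match (0.35) with no boost, mirroring the test described in the commit.
unclamped = 0.08 - 0.40          # negative without the clamp
boosted = clamp_effective_distance(0.08, 0.40)
plain = clamp_effective_distance(0.35, 0.0)
ranked = sorted([("weak", plain), ("boosted", boosted)], key=lambda item: item[1])
```

Sorting ascending by the clamped value still places the boosted hit first, which is the property the commit's test asserts.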
Chris Antenesse 733e435332 fix(searcher): guard against None metadata/doc in search result loops
ChromaDB can return None entries in metadatas/documents lists under
partial-flush, mid-delete, upgrade-boundary, and interrupted-mine
states. Add `meta = meta or {}` and `doc = doc or ""` guards in the
three result loops (search display, closet hybrid, drawer scored) so
.get() and .strip() calls never crash on None.

Fixes #1007, #1011
2026-05-06 01:59:24 -03:00
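A small sketch of the guard pattern named in this commit. `summarize_results` and the `"wing"` key are illustrative assumptions; the `meta = meta or {}` and `doc = doc or ""` lines are the guards the message quotes.

```python
def summarize_results(metadatas, documents):
    """Pair metadata with document text, tolerating None entries in either list."""
    rows = []
    for meta, doc in zip(metadatas, documents):
        meta = meta or {}   # None metadata becomes an empty dict, so .get() is safe
        doc = doc or ""     # None document becomes an empty string, so .strip() is safe
        rows.append((meta.get("wing", "unknown"), doc.strip()))
    return rows
```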
Igor Lins e Silva 46d9eb5df0 Merge pull request #1375 from MemPalace/fix/lint-e402-test-hooks-cli
fix(lint): hoist hooks_cli_mod import to top of test_hooks_cli (E402)
2026-05-06 01:58:29 -03:00
Igor Lins e Silva f854da779f fix(lint): hoist hooks_cli_mod import to top of test_hooks_cli (E402)
The alias was placed below an explanatory comment block introduced by
#1305, which trips ruff E402 (module-level import not at top of file).
Moved next to the existing 'from mempalace.hooks_cli import (...)' line.

CI lint went red on develop after #1305 merged with the failing check;
this re-greens it so subsequent PRs do not inherit the failure.
2026-05-06 01:57:44 -03:00
Igor Lins e Silva 67cda9d455 Merge pull request #1030 from eldar702/fix/none-metadata-residual-guards
fix: guard None metadata/doc in tool_check_duplicate and Layer1/Layer2
2026-05-06 01:51:24 -03:00
Igor Lins e Silva 0c8314f919 Merge pull request #1060 from alonehobo/fix/stdio-utf8
fix(mcp): force UTF-8 on stdio to fix -32000 on non-ASCII payloads (Windows)
2026-05-06 01:49:56 -03:00
Igor Lins e Silva 642a073305 Merge pull request #1114 from Sathvik-1007/fix/list-drawers-pagination-total
fix: add total count to tool_list_drawers pagination response
2026-05-06 01:49:37 -03:00
Igor Lins e Silva d9ab5b7fd3 Merge pull request #1305 from lcatlett/upstream/respect-absent-palace-dir
fix(hooks): treat absent ~/.mempalace as auto-save off
2026-05-06 01:49:22 -03:00
Igor Lins e Silva ea6f2c0c4c Merge pull request #1162 from imtylervo/fix/palace-write-lock-queue-pattern
fix: serialize ChromaCollection writes through palace lock
2026-05-06 01:48:51 -03:00
Igor Lins e Silva d1e27b8c42 style: ruff format new test files (CI lint) 2026-05-06 01:47:46 -03:00
Igor Lins e Silva 5ae83d8ec3 Merge pull request #1370 from MemPalace/docs/changelog-v3.3.5-batch1
docs(changelog): batch entries for 7 v3.3.5 fixes
2026-05-06 01:40:21 -03:00
Igor Lins e Silva 2c0ef2c04e docs(changelog): document v3.3.5 fixes from #1214 #1105 #1215 #1107 #1282 #1167 #1160
Bundled CHANGELOG entries for the seven Tier-1 PRs merged today, including
the behavior-change call-out for #1167 (KG date validators now reject
non-ISO inputs that previously produced silent empty results).
2026-05-06 01:38:57 -03:00
Igor Lins e Silva 53675dd194 Merge pull request #1160 from mvalentsev/fix/mcp-kg-lazy-per-path-cache
fix(mcp): lazy per-path KnowledgeGraph cache (#1136)
2026-05-06 01:33:47 -03:00
Igor Lins e Silva 7ede231da9 Merge pull request #1167 from arnoldwender/fix/kg-date-validation
fix(kg): validate ISO-8601 date formats at MCP boundary
2026-05-06 01:33:27 -03:00
Igor Lins e Silva 3824ea610c Merge pull request #1282 from mvalentsev/fix/fact-checker-stdio-utf8
fix(cli, fact-checker): reconfigure stdio to UTF-8 on Windows
2026-05-06 01:33:15 -03:00
Igor Lins e Silva 778f830cd0 Merge pull request #1107 from sha2fiddy/fix/1073-closet-llm-paginate
fix: paginate closet_llm col.get (#1073)
2026-05-06 01:33:04 -03:00
Igor Lins e Silva e18981a527 Merge pull request #1215 from arnoldwender/fix/entity-registry-atomic-write
fix(entity_registry): atomic write to prevent partial corruption on crash
2026-05-06 01:32:46 -03:00
Igor Lins e Silva ef0e45ad92 Merge pull request #1105 from mvalentsev/fix/chroma-backend-close-releases-lock
fix(backends/chroma): release SQLite file lock on close_palace/close (#1067)
2026-05-06 01:32:30 -03:00
Igor Lins e Silva 0cfb4b3ef1 Merge pull request #1214 from arnoldwender/fix/kg-temporal-inversion-guard
fix(kg): reject inverted intervals in add_triple (valid_to < valid_from)
2026-05-06 01:32:16 -03:00
fatkobra 6b042982e8 fix(repair): preflight SQLite integrity before rebuild 2026-05-05 15:51:43 +00:00
fatkobra bb40a529fd fix(migrate): verify write roundtrip before bailout 2026-05-05 09:01:05 +00:00
fatkobra 37e7d394b8 fix(repair): preflight poisoned max_seq_id 2026-05-05 07:22:10 +00:00
fatkobra eff844b168 fix(storage): quarantine partial HNSW flush without metadata 2026-05-04 10:34:22 +00:00
Arnold Wender 2e441d17a2 fix(entity_registry): fsync parent dir after rename for ext4 durability
Without this, on ext4 (and similar) filesystems the rename ack does not
guarantee durability across power loss — a crash can revert to a state
where the temp file is present and the target is at the old version.

Suggested by @jphein on #1215.
2026-05-04 11:08:14 +02:00
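One way to fsync a parent directory after a rename, sketched with standard-library calls only. `fsync_parent_dir` is a hypothetical helper name, and the POSIX-only gate is an assumption of this sketch rather than something the commit states.

```python
import os
from pathlib import Path


def fsync_parent_dir(path: Path) -> None:
    """Flush the directory entry for `path` so a completed rename survives power loss."""
    if os.name != "posix":
        return  # assumption: directory handles are not opened this way elsewhere
    fd = os.open(path.parent, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```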
Arnold Wender 4f36145c2e fix(entity_registry): atomic write to prevent partial corruption on crash
EntityRegistry.save() called Path.write_text() directly, which truncates
the target file and then writes — so a crash mid-write (power loss, OOM,
filesystem-full mid-flush) leaves an empty or half-written
entity_registry.json. The whole people/projects map is lost; the system
falls back to an empty registry on next load.

Switch to the standard atomic-write pattern: serialize to a sibling
.tmp file in the same directory (so os.replace stays on one filesystem),
fsync, chmod 0o600, then os.replace over the target. The replace is
atomic on POSIX and Windows, so any crash leaves the previous registry
intact instead of a truncated file.

Tests cover: no leftover .tmp on success, and previous content preserved
when os.replace itself raises mid-save.
2026-05-04 11:08:14 +02:00
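The atomic-write steps listed above (sibling .tmp file, fsync, chmod 0o600, os.replace) can be sketched as below. `atomic_write_json` is a hypothetical name, and the temp-file cleanup in the `finally` block is an assumption about how leftovers are avoided, not a description of EntityRegistry.save().

```python
import json
import os
from pathlib import Path


def atomic_write_json(target: Path, data: dict) -> None:
    """Write `data` to `target` so a crash leaves either the old or the new file."""
    # Sibling temp file: same directory, so os.replace stays on one filesystem.
    tmp = target.with_name(target.name + ".tmp")
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False)
            fh.flush()
            os.fsync(fh.fileno())   # push file contents to disk before the swap
        os.chmod(tmp, 0o600)
        os.replace(tmp, target)     # atomic swap over the previous version
    finally:
        # After a successful replace the temp file is already gone; after a
        # failure this removes the leftover and the old target stays intact.
        tmp.unlink(missing_ok=True)
```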
fatkobra f9d939ae1b fix(storage): quarantine bloated HNSW link payloads 2026-05-04 06:45:29 +00:00
mvalentsev 285b3b4f2e refactor(stdio): extract Windows UTF-8 reconfigure into shared helper
Both cli.py and fact_checker.py carried identical 28-line Windows stdio
reconfigure helpers; pull the loop into mempalace/_stdio.py so the same
machinery drives the CLI, the fact_checker --stdin entry point, and the
MCP server. The thin per-call-site wrappers stay so existing tests keep
importing _reconfigure_stdio_utf8_on_windows from the same module they
always have.

CLI / fact_checker policy unchanged: stdin=surrogateescape (don't crash
on a malformed redirected file), stdout/stderr=replace (don't crash
mid-print on a surrogate half round-tripped from a filename).
2026-05-03 22:25:31 +05:00
mvalentsev 75ad8ae781 ci: retrigger linux 3.13 (transient onnx download flake) 2026-05-03 22:04:22 +05:00
mvalentsev b8816e0fe2 fix(mcp): retry KG handlers once on concurrent close race
Race scenario: a KG tool handler calls _get_kg() and gets a live
KnowledgeGraph; another thread fires tool_reconnect() between that
return and the handler's kg.add_triple()/kg.query_entity()/etc call.
tool_reconnect drains _kg_by_path and closes the underlying
sqlite3.Connection; the handler then raises sqlite3.ProgrammingError:
'Cannot operate on a closed database', which surfaces as a -32000
to the MCP client even though the user just asked for a reconnect.

New _call_kg(op) helper wraps each handler's kg call in a one-shot
retry: catch exactly sqlite3.ProgrammingError, evict the stale entry
(only if the cache slot still points at the closed instance — another
thread may have already replaced it), and rerun op against a fresh
_get_kg(). Beyond one retry give up so a sustained close-stream
surfaces clearly instead of looping.

All five KG handlers (tool_kg_query, tool_kg_add, tool_kg_invalidate,
tool_kg_timeline, tool_kg_stats) now route through _call_kg.

Tests pin the contract:
  * retries with a fresh KG and returns the second result
  * non-ProgrammingError exceptions propagate without retry
  * gives up after exactly one retry on sustained close
2026-05-03 21:43:51 +05:00
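The one-shot retry contract described above can be sketched like this. The signature is a simplification for illustration: the real `_call_kg(op)` takes only `op`, while this version receives the lookup and eviction callables as parameters so it stays self-contained.

```python
import sqlite3


def call_kg_with_retry(get_kg, evict, op):
    """Run op(kg); on a closed-database error, evict the stale KG and retry once."""
    kg = get_kg()
    try:
        return op(kg)
    except sqlite3.ProgrammingError:
        # Only this exception type triggers a retry; anything else propagates.
        evict(kg)
        # Second and final attempt: a further ProgrammingError is not caught.
        return op(get_kg())
```

The `evict` callable is expected to drop the cache slot only if it still points at the closed instance, matching the guard the commit describes.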
mvalentsev 03643eb507 fix(cli, fact-checker): per-stream stdio errors policy on Windows
Previously all three streams reconfigured to UTF-8 with errors='strict'.
That kills 'mempalace search' the moment a drawer carrying a surrogate
half (round-tripped from a filename via surrogateescape) hits print(),
losing the rest of the result block. Same hazard for warning lines on
stderr.

Split the policy:
  stdin  -> surrogateescape (malformed bytes from a redirected file
            survive as lone surrogates instead of crashing the read)
  stdout -> replace (a stray surrogate in drawer text is replaced with "?"
            instead of raising UnicodeEncodeError mid-print)
  stderr -> replace (same protection for logger / warning paths)

Applied identically in the cli.py and fact_checker.py helpers; the DRY
extraction into a shared module is a separate cleanup ask, kept out of
this fix to keep the diff narrow.

Tests updated for the new per-stream assertion.
2026-05-03 21:37:12 +05:00
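A rough sketch of the per-stream policy above, built on `io.TextIOWrapper.reconfigure`. `apply_stdio_policy` and `STREAM_ERRORS` are hypothetical names, and the Windows platform gate and warning log of the real helper are left out so the example runs anywhere.

```python
import io

# Error handler per stream, as listed in the commit message.
STREAM_ERRORS = {
    "stdin": "surrogateescape",
    "stdout": "replace",
    "stderr": "replace",
}


def apply_stdio_policy(streams):
    """Reconfigure each named text stream to UTF-8 with its own error handler."""
    for name, stream in streams.items():
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is None:
            continue  # replaced stream (test harness, notebook): leave it alone
        reconfigure(encoding="utf-8", errors=STREAM_ERRORS[name])


# Demonstration on in-memory streams instead of the real sys.stdin/sys.stdout.
demo_in = io.TextIOWrapper(io.BytesIO(b"a\xffb"), encoding="cp1252")
demo_out_buffer = io.BytesIO()
demo_out = io.TextIOWrapper(demo_out_buffer, encoding="cp1252")
apply_stdio_policy({"stdin": demo_in, "stdout": demo_out})
```

In real use the mapping would be built from `sys.stdin`, `sys.stdout`, and `sys.stderr`.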
mvalentsev 32f4dfa26d fix(cli): reconfigure stdio to UTF-8 on Windows
The primary `mempalace` console_script (`cli.py:main()`) reads non-ASCII
arguments via piped stdin and writes verbatim drawer text / wing names
through `print()`. On Windows, Python defaults stdio to the system ANSI
codepage (cp1252/cp1251/cp950), so:

- `mempalace search "..." > out.txt` mojibakes any drawer text containing
  non-Latin characters
- `mempalace ... < input.txt` mojibakes piped non-ASCII input

Reconfigure stdin/stdout/stderr to UTF-8 (`errors="strict"`) at the top
of `main()`, mirroring the helper added in this PR for fact_checker's
`__main__` block. Wrapped in try/except so a replaced stream (Jupyter,
test harness) logs a warning and continues rather than crashing the CLI.

The reconfigure cascades through every `mempalace` subcommand
(`init`/`mine`/`search`/`status`/`hook`/etc.) and through the interactive
flows that read non-ASCII names via `input()` (onboarding, entity
detector, room detector). With this commit the package's three
user-facing entry points (`mempalace`, `mempalace-mcp`, and
`python -m mempalace.fact_checker`) all reconfigure stdio identically on
Windows.
2026-05-03 21:33:54 +05:00
mvalentsev 7cee74c8c8 fix(fact-checker): reconfigure stdio to UTF-8 on Windows
The `python -m mempalace.fact_checker --stdin` entry point reads non-ASCII
text through the system ANSI codepage (cp1252/cp1251/cp950) on Windows,
which mojibakes characters before claim-extraction sees them. Reconfigure
stdin/stdout/stderr to UTF-8 with `errors="strict"`, wrapped in try/except
so a replaced stream (Jupyter, test harness) logs a warning rather than
crashing the CLI.

Mirrors the same fix shipped for `mcp_server.py:main()` (#400) and
`hooks_cli.py:run_hook()` (#1280) -- this is the third and last
stdin-reading entry point in the package.
2026-05-03 21:33:54 +05:00
mvalentsev 45df1a2657 fix(backends/chroma): release SQLite file lock on close_palace/close (#1067)
ChromaBackend.close_palace() and close() evicted cached PersistentClients
from self._clients without calling client.close(), so chromadb 1.5.x kept
the rust-side SQLite file lock until GC. Reopening the same palace path
after shutil.rmtree + re-create within one process then failed with
SQLITE_READONLY_DBMOVED (SQLite code 1032).

Add _close_client() helper with a try/except fallback for older chromadb,
and route close_palace(), close(), and the DB-file-missing invalidation
branch of _client() through it. The mtime/inode auto-invalidation branch
is left as-is: callers there may still hold a live ChromaCollection
handle, and closing out from under them clears the rust bindings mid-use.

Regression tests cover close_palace reopen-same-path and whole-backend
close for multiple palaces.
2026-05-03 19:16:25 +05:00
mvalentsev 0a62658051 fix(mcp): drain KG cache on tool_reconnect
tool_reconnect cleared ChromaDB caches but left _kg_by_path entries
intact. After an external replacement of knowledge_graph.sqlite3 the
server kept serving the old open sqlite3.Connection, returning stale
results.

Now iterate _kg_by_path under _kg_cache_lock, call close() best-effort,
and clear the dict so the next tool call reopens the KG from disk.
Two new tests in TestKGLazyCache verify cache invalidation and that a
failing close() does not block the clear.
2026-05-03 17:43:00 +05:00
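The drain step described above, sketched with the cache and lock passed in as parameters. `drain_kg_cache` is a hypothetical name; the real code operates on the module-level `_kg_by_path` and `_kg_cache_lock`.

```python
import threading


def drain_kg_cache(cache, lock):
    """Close every cached KG best-effort, then clear the cache."""
    with lock:
        for kg in list(cache.values()):
            try:
                kg.close()
            except Exception:
                # A failing close() must not block the clear.
                pass
        cache.clear()
```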
mvalentsev 19f8a4ff68 style(mcp): drop issue-tracker comments from KG cache block
Inline comments referencing #1136 and #540 add no information the
identifiers do not already convey. PR description carries the context;
code stays quiet.
2026-05-03 17:43:00 +05:00
mvalentsev 84f9726a39 test(mcp): fix Windows subprocess env in KG lazy-init test
Passing a stripped env dict without SYSTEMROOT/WINDIR breaks Python
bootstrap on Windows (_Py_HashRandomization_Init). Inherit the parent
env and strip MEMPAL* vars instead, then override HOME/USERPROFILE to
the tmp dir.
2026-05-03 17:43:00 +05:00
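The environment handling described in this commit could be sketched as follows. `build_child_env` is a hypothetical helper; the case-insensitive prefix match is an assumption of this sketch.

```python
import os


def build_child_env(tmp_home, base=None):
    """Inherit the parent environment, drop MEMPAL* variables, and redirect the home dir."""
    env = dict(os.environ if base is None else base)
    # Keep SYSTEMROOT/WINDIR and everything else Python needs to bootstrap,
    # removing only the variables that would leak palace configuration.
    for key in [k for k in env if k.upper().startswith("MEMPAL")]:
        del env[key]
    env["HOME"] = str(tmp_home)
    env["USERPROFILE"] = str(tmp_home)
    return env
```

The resulting dict is suitable for the `env=` argument of `subprocess.run`.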
mvalentsev c69a622a18 test(mcp): add multi-tenant and lazy-init tests for KG (#1136)
TestKGLazyCache covers the scenarios behind the lazy per-path refactor:

- test_lazy_init_no_import_side_effect: a fresh subprocess import does
  not create ~/.mempalace/knowledge_graph.sqlite3 (what closed PR #167
  was aiming at).
- test_get_kg_returns_same_instance: two _get_kg() calls under the same
  resolved path return the same object, cache has one entry.
- test_get_kg_different_paths_different_instances: rotating env var
  produces distinct KGs.
- test_multi_tenant_env_switch: the exact scenario from #1136 — write
  under path A, query under path B returns empty, switching back to A
  sees the fact.
- test_cache_thread_safe: 16 threads racing _get_kg() end up with one
  shared instance and one cache entry.
2026-05-03 17:43:00 +05:00
mvalentsev 9e730098e9 test(mcp): migrate _kg monkeypatches to _get_kg (#1136)
Direct module-attribute patching of _kg is obsolete after the lazy
cache refactor. Switch test helpers to patch _get_kg instead so the
fixture KG replaces the factory rather than a now-missing singleton.

- tests/test_mcp_server.py: _patch_mcp_server helper
- tests/benchmarks/test_mcp_bench.py: _patch_mcp_config helper
- tests/benchmarks/test_memory_profile.py: inline patch in test_tool_status_repeated_calls
2026-05-03 17:43:00 +05:00
mvalentsev beac5d9954 refactor(mcp): replace eager _kg with lazy per-path cache (#1136)
Swap the module-level KnowledgeGraph singleton for a lazy, per-path
cache keyed by the resolved sqlite path. Import no longer creates a
sqlite file as a side effect, and MCP servers started with --palace
now route KG calls to the correct tenant when MEMPALACE_PALACE_PATH
changes between calls, matching the per-call behavior of _get_client()
on the ChromaDB side.

Default-path behavior is preserved: without --palace at startup, KG
stays on DEFAULT_KG_PATH regardless of env var. The "no --palace but
env var set" case is #540's scope and is not changed here.
2026-05-03 17:43:00 +05:00
Igor Lins e Silva 1888b671e2 Merge pull request #1321 from MemPalace/fix/1313-init-palace-flag
fix(cli): honor --palace flag in cmd_init (#1313)
2026-05-03 03:54:06 -03:00