fix(mcp_server): pass embedding_function= on collection reopen (#1299)

`mcp_server._get_collection` bypassed `ChromaBackend.get_collection`
and called `client.get_collection` / `client.create_collection` without
`embedding_function=`. ChromaDB 1.x does not persist the EF identity
with the collection, so the MCP server's reopen silently bound
chromadb's built-in `DefaultEmbeddingFunction` while the miner / Stop
hook ingest path bound `mempalace.embedding.get_embedding_function()`.

On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple
Silicon, per #1299) the default EF's lazy ONNX provider selection could
SIGSEGV the host process on first `col.add()`, killing the MCP stdio
server and leaving every subsequent tool call returning
`Connection closed` until Claude Code was relaunched. Reads worked
because `col.get(ids=...)` and metadata fetches don't invoke the EF;
the auto-ingest path worked because mining routes through the backend
abstraction. Diary writes were the consistent failure surface.

Resolve the EF up front (matching `ChromaBackend._resolve_embedding_function`)
and pass it into both reopen branches. Falls back to the chromadb default
only if `mempalace.embedding.get_embedding_function` itself raises.

Regression test patches the chromadb client class to capture
`embedding_function=` on every `get_collection` / `create_collection`
call from `_get_collection(create=True)` and `_get_collection()`, and
fails if any call omits it.

Follow-up to #1262 / #1289 (which fixed the metadata-mismatch SIGSEGV
path); this addresses the EF-mismatch SIGSEGV path on the same surface.
This commit is contained in:
Igor Lins e Silva
2026-05-01 19:34:38 -03:00
parent 5e540da06b
commit ac6c2b6af6
3 changed files with 77 additions and 2 deletions
+1
View File
@@ -19,6 +19,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
### Bug Fixes
- **MCP server `tool_diary_write` SIGSEGV when EF default differs.** `mcp_server._get_collection` bypassed `ChromaBackend.get_collection` and called `client.get_collection` / `client.create_collection` without `embedding_function=`. ChromaDB 1.x does not persist the EF identity with the collection, so the MCP server's reopen silently bound chromadb's built-in `DefaultEmbeddingFunction` to the collection while the miner / Stop hook ingest path bound `mempalace.embedding.get_embedding_function()`. On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple Silicon) the default's lazy ONNX provider selection could SIGSEGV the host process on first `col.add()`, killing the MCP stdio server and leaving every subsequent tool call returning `Connection closed` until Claude Code was relaunched. `_get_collection` now resolves the EF via `mempalace.embedding.get_embedding_function` and passes it into both reopen branches, matching the miner/backend path. (#1299, follow-up to #1262 / #1289)
- **Cross-wing topic tunnels for hyphenated dir names.** `mempalace init` recorded the `topics_by_wing` registry key under the raw directory name (e.g. `mempalace-public`), while `mempalace.yaml`'s `wing` field used the lower-cased + separator-collapsed slug (`mempalace_public`). At mine time the miner read the slug from the yaml and missed the registry, so `_compute_topic_tunnels_for_wing` returned `0` silently. Real-world: any project whose folder contained a hyphen or space lost every topic tunnel. Now both call sites route through a shared `normalize_wing_name()` in `config.py`. (#1194, follow-up to #1180)
- **CLI `mempalace search` retrieval quality.** The CLI was using pure ChromaDB cosine distance with no BM25 rerank, so drawers containing every query term but embedding as noise (directory listings, diff output, shell logs) scored `Match: 0.0` alongside genuinely irrelevant results with no way to tell them apart. Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool already used, and surfaced both `cosine=` and `bm25=` scores in the output so users see which component of the match is firing. MCP search was unaffected; this fixes the human-facing CLI parity gap.
- **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. Without the warning such palaces silently used L2 distance, under which the similarity display floored every result to `Match: 0.0`. New palaces mined today already set cosine correctly and now have invariant tests pinning that behavior so future refactors can't silently regress it. (#1179)