Commit Graph

16 Commits

Author SHA1 Message Date
Igor Lins e Silva e9aee19433 fix(tests): apply ruff format after rebase resolution
The collection_name plumbing rebase produced a few unformatted blocks
in test_mcp_server.py and test_searcher.py; bringing them in line with
the 0.4.x CI pin so test-windows / lint stay green.
2026-05-07 09:10:22 -03:00
Mika Cohen ec6d2dde01 fix: use configured collection in recovery paths 2026-05-07 09:10:00 -03:00
Igor Lins e Silva aac8437979 style: ruff format tests/test_searcher.py (CI lint) 2026-05-06 02:19:54 -03:00
eldar702 5347c2c71c fix(searcher): clamp effective_distance to valid cosine range [0, 2]
``search_memories`` computes ``effective_dist = dist - boost`` where
``boost`` can be as large as ``CLOSET_RANK_BOOSTS[0] == 0.40`` for a
rank-0 closet hit. When the raw drawer distance is small — any
near-exact match — the subtraction goes negative.

Two downstream effects:

1. Line 418 returns ``round(max(0.0, 1 - effective_dist), 3)`` as
   ``similarity``. With ``effective_dist = -0.30`` that yields
   ``similarity = 1.30``, outside the documented ``[0, 1]`` range.
   The ``max(0.0, ...)`` only prevents negative similarities; it does
   not cap above 1.
2. Line 427 stores ``_sort_key: effective_dist`` and line 435 sorts
   ``scored`` ascending by that key. A negative key drops *below* the
   rest, so the strongest hybrid matches end up sorting after weaker
   ones — ranking inversion under the exact conditions hybrid retrieval
   is supposed to serve best.

Clamp ``effective_dist`` to the valid cosine-distance range ``[0, 2]``.
The boost still wins (closet-backed hit still ranks first), it just no
longer flips the order.

Test added: mock drawer_col (base dist 0.08 / 0.35 for two sources) +
closet_col (rank-0 closet for the 0.08 source) → assert all hits have
``0 <= similarity <= 1`` and ``0 <= effective_distance <= 2``, and that
the closet-boosted source still ranks first.

Relationship to other PRs:

* **#988** clamps the output ``similarity`` alone. That does not fix
  the sort-key inversion or the invalid ``effective_distance`` in the
  returned dict. This PR clamps at the arithmetic source so both
  downstream users of the value stay in range.
* Orthogonal to **#979** (``tool_check_duplicate`` negative similarity).
2026-05-06 02:19:54 -03:00
jp 67248330c5 chore: ruff format tests/test_searcher.py
CI lint job runs `ruff format --check`; the new tests in TestBM25NoneSafety
needed the standard "blank line after import-inside-function" + line-length
wrap. No logic change — formatter pass only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 07:22:53 -07:00
jp ee12c07c54 fix(searcher): tolerate None documents in BM25 reranker
`_tokenize` calls `text.lower()` unconditionally; when ChromaDB returns a
drawer with `documents` containing `None`, the hybrid-rerank path raises
`AttributeError: 'NoneType' object has no attribute 'lower'`.

Observed in production daemon log (2026-04-24 21:07:05) during a search
that triggered `_hybrid_rank → _bm25_scores → _tokenize`:

    File "mempalace/searcher.py", line 81, in _bm25_scores
        tokenized = [_tokenize(d) for d in documents]
    File "mempalace/searcher.py", line 52, in _tokenize
        return _TOKEN_RE.findall(text.lower())
    AttributeError: 'NoneType' object has no attribute 'lower'

Closes the gap left by the upstream None-metadata audit (#999), which
covered metadata loops but not BM25 helpers. Returns `[]` for falsy input
so a None doc gets score 0.0 while the rest of the corpus reranks normally.

Three regression tests in TestBM25NoneSafety lock the behavior and reference
the production trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 06:27:14 -07:00
Igor Lins e Silva 133dfbfb41 fix(search): BM25 hybrid rerank, legacy-metric warning, invariant tests
Three tightly-coupled search-quality fixes for v3.3.3:

1. CLI `mempalace search` now routes through the same `_hybrid_rank`
   the MCP path already used. Drawers whose text contains every query
   term but embed as file-tree noise (directory listings, diffs, log
   fragments) were scoring cosine distance >= 1.0 — the display formula
   `max(0, 1 - dist)` then floored every result to `Match: 0.0`, with
   no way for the user to tell a lexical match from a total miss. BM25
   catches these cleanly; the display surfaces both `cosine=` and
   `bm25=` so users see which component is firing.

2. Legacy-palace distance-metric warning. Palaces created before
   `hnsw:space=cosine` was consistently set silently use ChromaDB's
   default L2 metric, which breaks the cosine-similarity formula (L2
   distances routinely exceed 1.0 on normalized 384-dim vectors). The
   search path now detects this at query time and prints a one-line
   notice pointing at `mempalace repair`. Only fires for legacy
   palaces; new palaces already set cosine correctly.

3. Invariant tests pinning `hnsw:space=cosine` on every collection-
   creation path — legacy `get_or_create_collection`, legacy
   `create_collection`, RFC 001 `get_collection(create=True)`, the
   public `palace.get_collection`, and a round-trip through reopen.
   Locks down the correctness that new-user palaces already have so a
   future refactor can't silently regress it.

Also adds a `metadata` property to `ChromaCollection` so callers can
read the underlying hnsw:space without reaching into `_collection`.

Tests:
- New regression: simulate three candidates at distance 1.5 (cosine=0),
  one containing query terms — must rank first with non-zero bm25.
- New: legacy metric (empty or non-cosine) produces stderr warning.
- New: correctly-configured palace produces no warning.
- New: all five creation paths pin cosine metadata.

All existing tests still pass.
2026-04-25 00:39:37 -03:00
jp 7690574dde fix(searcher): guard API path + closet loop against None metadata too
Per Copilot review on the CLI-only PR (#999): search_memories() has the
same vulnerability in two additional spots, since ChromaDB can return
None entries in the inner metadatas list for either the drawer query or
the closets query. Without guards, the API path crashes with:

    AttributeError: 'NoneType' object has no attribute 'get'

at either \`cmeta.get("source_file", "")\` in the closet boost lookup or
\`meta.get("source_file", "") or ""\` in the drawer scoring loop.

Applies the matching \`meta = meta or {}\` / \`cmeta = cmeta or {}\`
guard at both sites and adds an API-path regression test that mocks a
drawer query result with a None metadata entry and asserts both hits
render — the None-metadata hit with the existing \`"unknown"\` sentinel
values the scoring loop already writes for missing keys.

Verified both the new API test and the existing CLI test fail without
the guards (AttributeError) and pass with them.
2026-04-18 10:37:05 -07:00
jp a3c778210b fix(searcher): guard against None metadata in CLI print path
`col.query(...)` can return `None` entries in the inner ``metadatas`` list
for drawers whose metadata was never set (older palaces, rows written
outside the normal mining path). The CLI `search()` function would render
earlier results successfully and then crash mid-loop with:

    AttributeError: 'NoneType' object has no attribute 'get'

at ``searcher.py:286`` — ``meta.get("source_file", "?")``. The user sees
partial output followed by a traceback, with no indication of which
drawers rendered OK and which were skipped.

Guard with ``meta = meta or {}`` inside the loop so entries with missing
metadata fall back to the existing ``"?"`` defaults instead of crashing,
matching the hit dict assembly in ``search_memories()`` which already
uses ``meta.get("wing", "unknown")`` etc. against the same data.

Adds a regression test that mocks a ChromaDB result with a ``None``
metadata entry in the middle of the inner list and asserts both result
blocks render to stdout.
2026-04-18 10:00:59 -07:00
sha2fiddy a15094ce60 feat: include created_at timestamp in search results (#846)
* feat: include created_at timestamp in search results (closes #465)

Surface the existing filed_at metadata as created_at in search result
objects returned by search_memories(). Enables temporal reasoning over
search hits without additional queries.

* Feat: add fallback for missing filed_at metadata
2026-04-15 00:26:57 -07:00
Sergey Kuznetsov ae5196bc8d Мempalace backend seam (#413)
* refactor: add stage-1 backend abstraction seam

Introduce the first upstreamable storage seam for MemPalace without
bringing in the PostgreSQL spike or any benchmark artifacts.

This change adds a small backend package with:
- BaseCollection as the minimal collection contract
- ChromaBackend/ChromaCollection as the default implementation

It then routes the main runtime collection consumers through that seam:
- palace.py
- searcher.py
- layers.py
- palace_graph.py
- mcp_server.py
- miner.status()

Behavioral constraints kept for stage 1:
- ChromaDB remains the only backend and the default path
- no config/env backend selection yet
- no PostgreSQL code
- no benchmark or research files
- existing tests stay unchanged

Important compatibility details:
- read paths now call the seam with create=False so they still surface
  the existing 'no palace found' behavior instead of silently creating
  empty collections
- write paths keep create=True semantics through palace.get_collection()
- layers/searcher retain a chromadb module attribute so the existing
  mock-based tests can keep patching PersistentClient unchanged
- ChromaBackend only creates palace directories on create=True, which
  preserves mocked read-path tests that use fake read-only paths

Verification:
- python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q  # 529 passed, 106 deselected

* refactor: clean up stage-1 seam compatibility shims

Tighten the stage-1 backend abstraction branch after review.

This follow-up does three small things:
- keep the chromadb compatibility hook in searcher.py and layers.py,
  but express it through the backends.chroma module so it no longer
  reads like an accidental unused import
- fix the palace_graph.py helper alias to avoid the local name collision
  flagged by ruff (imported helper vs local _get_collection wrapper)
- preserve the existing mock-based test patch points unchanged while
  keeping the new backend seam intact

Why this matters:
- the direct  form looked like a
  dead import in review, even though it was intentionally preserving the
  existing test seam ( and
  )
- palace_graph.py had a real lint issue ( redefinition) that was
  small but worth fixing before a public PR

Verification:
- /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q tests/test_layers.py tests/test_searcher.py
- pytest -q  # 529 passed, 106 deselected

* docs: explain backend shim imports in search paths

Add short code comments in searcher.py and layers.py explaining why the
module-level `chromadb` alias remains after the stage-1 backend seam
refactor.

The alias is intentional: it preserves the existing mock patch points used
by the current test suite (`mempalace.searcher.chromadb.PersistentClient`
and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic
now flows through the backend abstraction.

This keeps the public PR easier to review because the apparent "unused
import" now has an explicit reason next to it.

Verification:
- /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py
- pytest -q tests/test_layers.py tests/test_searcher.py

* refactor: reuse a default backend instance in palace helper

Tighten the stage-1 backend seam by promoting the default Chroma backend
adapter to a module-level singleton in `mempalace/palace.py`.

This keeps the stage-1 scope unchanged — Chroma is still the only backend
wired in this branch — but avoids constructing a fresh `ChromaBackend()`
object on every `get_collection()` call. The backend is stateless today,
so this is a readability/cleanup change rather than a behavioral one.

Why this helps:
- makes `palace.get_collection()` read like a real default factory instead
  of an inline constructor call
- keeps the stage-1 branch a little cleaner before opening the public PR
- does not widen the backend surface or change any config/runtime behavior

Verification:
- python3 -m py_compile mempalace/palace.py
- pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py
- pytest -q  # 529 passed, 106 deselected

* fix: harden read-only seam behavior and update seam tests

Preserve the stage-1 backend abstraction while closing the real read-path
regression surfaced in PR review.

What changed:
- make ChromaBackend.get_collection(create=False) fail fast when the palace
  directory does not exist instead of letting PersistentClient create it as a
  side effect
- update miner.status() to call get_collection(..., create=False) so status
  keeps the historical 'No palace found' behavior
- remove the temporary chromadb shim aliases from layers.py and searcher.py
  now that the tests patch the seam directly
- add focused tests for the new backends package, including ChromaCollection
  delegation and ChromaBackend create=True/create=False behavior
- retarget layer/searcher tests to patch the backend seam instead of patching
  chromadb.PersistentClient inside production modules
- add a regression test that status() does not create an empty palace when the
  target path is missing

Verification:
- ruff check .
- uv run pytest -q
- uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py

Notes:
- the separate benchmark/slow/stress layer was started as a soak but not used
  as the merge gate for this PR branch

* refactor: drop duplicate mcp collection cache declaration

Remove a redundant `_collection_cache = None` assignment in
`mempalace/mcp_server.py` left over after the stage-1 backend seam refactor.

This does not change behavior; it only trims review noise in the MCP server
module after the read-path hardening pass.

Verification:
- ruff check mempalace/mcp_server.py
- uv run pytest -q tests/test_mcp_server.py

---------

Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>
2026-04-11 16:16:49 -07:00
Tal Muskal e24d8ca733 test: expand coverage to 70%, fix mcp_server CI crash (threshold 60%)
Add/expand tests for normalize (39%→97%), searcher (39%→100%),
layers (28%→97%), split_mega_files (34%→72%).

Fix mcp_server.py parse_args→parse_known_args to prevent SystemExit
when imported during pytest (CI was crashing on all test jobs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 21:07:03 +03:00
Igor Lins e Silva 47696bef8c fix: address Copilot review — derive MCP version, improve test isolation and portability 2026-04-08 04:41:03 -03:00
Igor Lins e Silva 96de23cd97 fix: CI failures — update workflow for uv migration, fix lint and format
- Switch CI install step from `pip install -r requirements.txt` to
  `pip install -e ".[dev]"` since requirements.txt was removed
- Add noqa: E402 to intentionally-late imports in conftest.py
  (HOME must be isolated before mempalace imports)
- Remove unused KnowledgeGraph import in test_knowledge_graph.py
- Apply ruff formatting to test files
2026-04-07 17:59:21 -03:00
Igor Lins e Silva cd8b245fdc fix: address Copilot review — remove unused imports, isolate HOME in tests, restore dev extra 2026-04-07 17:55:10 -03:00
Igor Lins e Silva 72c548b729 test: expand coverage from 20 to 92 tests, migrate to uv
- Migrate from setuptools to hatchling build backend
- Add dependency-groups (PEP 735) for dev tooling (pytest, ruff)
- Remove redundant requirements.txt in favor of uv.lock
- Fix __version__ mismatch (2.0.0 -> 3.0.0 to match pyproject.toml)

New test files:
- conftest.py: shared fixtures (isolated palace, KG, ChromaDB collection)
- test_knowledge_graph.py: 17 tests (entity CRUD, temporal queries, timeline)
- test_mcp_server.py: 25 tests (protocol dispatch, read/write/KG/diary tools)
- test_searcher.py: 7 tests (search_memories API, filters, error handling)
- test_dialect.py: 13 tests (AAAK compression, entity/emotion detection, zettel encoding)

All 92 tests pass on Python 3.13 with chromadb 0.6.3.
2026-04-07 17:55:10 -03:00