Commit Graph

803 Commits

Author SHA1 Message Date
Igor Lins e Silva 57ac669dbc fix(repair): address Copilot review on #1227
Five Copilot review issues + the Python 3.9 CI failure rolled into one
follow-up:

* Replace ``dict | None`` annotated assignment with a type-comment so
  module load doesn't evaluate PEP 604 syntax on Python 3.9 (CI red).
* Drop ``mempalace repair rebuild`` — the CLI only ships ``mempalace
  repair`` (rebuild) and ``mempalace repair-status``. Updated all
  user-facing messages, docstrings, and test assertions.
* Replace ``_get_client()`` in ``tool_search`` with the safe
  ``_refresh_vector_disabled_flag`` probe so the fallback isn't
  defeated by the very chromadb client load it's trying to avoid.
* Short-circuit ``tool_status`` to a pure-sqlite reader
  (``_tool_status_via_sqlite``) when divergence is detected so wing /
  room counts come back without ever opening the persistent client.
* Wrap the recency-window query in ``_bm25_only_via_sqlite`` with an
  ``id``-ordered fallback so legacy schemas missing ``created_at``
  don't break BM25 search.
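
A minimal sketch of the ``id``-ordered fallback from the last bullet, for illustration only. The ``docs`` table, its columns, and the ``recent_rows`` helper are hypothetical names; they are not the schema or the code in ``_bm25_only_via_sqlite``.

```python
import sqlite3


def recent_rows(conn: sqlite3.Connection, limit: int = 5):
    """Return the newest rows, preferring created_at and falling back to id."""
    try:
        cur = conn.execute(
            "SELECT id, document FROM docs ORDER BY created_at DESC LIMIT ?",
            (limit,),
        )
    except sqlite3.OperationalError:
        # Assumed legacy schema: no created_at column, so order by id instead.
        cur = conn.execute(
            "SELECT id, document FROM docs ORDER BY id DESC LIMIT ?",
            (limit,),
        )
    return cur.fetchall()


# Legacy-shaped table (hypothetical) that lacks created_at.
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, document TEXT)")
legacy.executemany(
    "INSERT INTO docs (document) VALUES (?)", [("a",), ("b",), ("c",)]
)
legacy_rows = recent_rows(legacy, limit=2)
```

Catching ``sqlite3.OperationalError`` broadly is a simplification; a stricter version might inspect the table's columns first so that unrelated errors are not masked.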

New test covers the sqlite-status short-circuit. 1409 tests pass.
2026-04-26 21:53:56 -03:00
Igor Lins e Silva 0d349c3d86 fix(repair): detect HNSW capacity divergence and fall back to BM25 (#1222)
When chromadb's HNSW segment freezes at a stale max_elements while
sqlite keeps accumulating embeddings, the next chromadb open segfaults
the MCP server on every tool call. Adds a pure-filesystem capacity probe
(zero chromadb interaction), a `mempalace repair-status` read-only
health check, and a BM25-only sqlite fallback so the palace stays
reachable even when vector search is unavailable.

* `hnsw_capacity_status` reads sqlite + index_metadata.pickle directly
  via a tight-allowlist unpickler — no hnswlib import, no segment load.
* MCP server runs the probe at startup and after every reconnect; sets
  `_vector_disabled` and routes search to the sqlite FTS5 + BM25 path.
* `tool_status` and `tool_reconnect` surface the fallback state.
* Threshold tuned for chromadb 1.5.x async-flush lag (2× sync_threshold).
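
The allowlist-unpickler idea in the first bullet can be sketched with the standard `pickle.Unpickler.find_class` hook. This is an illustration under stated assumptions: `ALLOWED_GLOBALS`, `AllowlistUnpickler`, and `safe_loads` are hypothetical names, and the allowlist actually used by `hnsw_capacity_status` is not shown in this log.

```python
import collections
import io
import pickle

# Hypothetical allowlist: only these (module, name) globals may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Plain containers need no global lookup, so they load normally.
metadata = safe_loads(pickle.dumps({"max_elements": 1000}))

# A pickle that references a non-allowlisted global is rejected.
try:
    safe_loads(pickle.dumps(collections.Counter))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Keeping the allowlist as small as possible is the point of the technique: anything the metadata file does not strictly need stays unresolvable.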
2026-04-26 19:54:00 -03:00
Igor Lins e Silva 899a5ec4c6 Merge pull request #1225 from MemPalace/feat/privacy-warn-tailscale-cgnat
feat(privacy): treat Tailscale CGNAT range (100.64.0.0/10) as local
2026-04-26 19:40:58 -03:00
Igor Lins e Silva 22f3d9be30 Merge pull request #1223 from MemPalace/chore/corpus-origin-merge-followup
chore(corpus-origin): tag merged evidence by tier + pin confidence-source contract
2026-04-26 19:39:01 -03:00
MSL a0b7ba005d feat(privacy): treat Tailscale CGNAT range (100.64.0.0/10) as local
2 files changed, 60 insertions, 0 deletions. 2 new tests (RED-first).

Follow-up to #1224's privacy warning. The URL-based heuristic in
``mempalace.llm_client._endpoint_is_local`` shipped without recognizing
Tailscale's CGNAT range (100.64.0.0/10), so a user running LM Studio,
Ollama, or any local LLM reachable via a Tailscale-assigned 100.x.x.x
address got a spurious privacy warning. Tailscale addresses are
network-private (only reachable inside the user's Tailnet), but they
are not RFC1918, so the heuristic treated them as external.

This PR adds CGNAT recognition: when the hostname starts with ``100.``
AND the second octet is between 64 and 127 inclusive, it's classified
as local. Addresses in 100.x.x.x outside that range (i.e. second octet
< 64 or > 127) are regular allocated public space and remain external,
so a user pointing at a public 100.0.0.1 still gets the warning.
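
The range check described above can be sketched in two equivalent ways. Both helpers are hypothetical illustrations, not the code in ``_endpoint_is_local``: one follows the octet rule as worded in this message, the other delegates to the standard ``ipaddress`` module.

```python
import ipaddress

CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_cgnat_by_octets(hostname: str) -> bool:
    """Octet rule: first octet is 100 and second octet is 64..127 inclusive."""
    parts = hostname.split(".")
    if len(parts) != 4 or not all(p.isascii() and p.isdigit() for p in parts):
        return False
    octets = [int(p) for p in parts]
    if any(o > 255 for o in octets):
        return False
    return octets[0] == 100 and 64 <= octets[1] <= 127


def is_cgnat_by_network(hostname: str) -> bool:
    """Same classification via ipaddress; non-IP hostnames are not CGNAT."""
    try:
        return ipaddress.ip_address(hostname) in CGNAT
    except ValueError:
        return False
```

A /10 prefix fixes the top two bits of the second octet at ``01``, which is why the range covers exactly second octets 64 through 127.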

Concrete user impact:

  Before: ``mempalace init --llm-provider openai-compat --llm-endpoint http://100.100.50.50:1234``
          (LM Studio on Tailnet) → triggers privacy warning incorrectly.

  After:  same command → no warning. Data stays inside the user's
          Tailnet, so there is no external exposure to warn about.

TDD: 2 tests added in ``tests/test_llm_client.py``, both RED-first.

1. ``test_openai_compat_provider_tailscale_cgnat_endpoint_is_local``
   — covers three Tailscale CGNAT addresses (start, middle, near-end
   of the range) and pins they're all classified local. This was the
   RED that drove the implementation.

2. ``test_openai_compat_provider_outside_tailscale_cgnat_is_external``
   — pins the boundary on both sides: addresses with second octet 0-63
   and 128-255 stay external. Prevents future "treat all 100.x.x.x as
   local" overcorrection.

Tests: 1388 total mempalace tests pass. 2 pre-existing environmental
failures unrelated to this change (chromadb optional dep). Ruff check
+ format both clean.

Backwards compatible: only widens the local-recognition set. Anything
classified local before is still classified local; anything classified
external before remains so unless it's specifically in the CGNAT range.

Out of scope (tracked for future iteration based on real user feedback,
not built speculatively): pre-init confirmation prompt before sending
to external API, persistent ``private-only`` config flag that refuses
external endpoints entirely, explicit cloud-provider name detection
("Using Anthropic's hosted API at ..." vs the current generic warning).
2026-04-26 15:31:44 -07:00
Igor Lins e Silva 5e33592ba2 chore(corpus-origin): address Copilot review on #1223
- cli.py: stringify each evidence entry exactly once before the
  startswith check (was calling str(e) twice per element).
- tests: replace brittle `confidence != 0.90` assertion with an
  equality check against detect_origin_heuristic on the same samples.
  The original would have spuriously fired if the heuristic ever
  legitimately produced 0.90 for these samples; the new form pins the
  contract directly.
2026-04-26 19:18:57 -03:00
Igor Lins e Silva c92256f08f chore(corpus-origin): tag merged evidence by tier + pin confidence-source contract
Two follow-ups to PR #1221's merge-fields behavior, both raised by the
Copilot review on that PR:

- Evidence merge now prefixes each entry with `Tier-1 heuristic: ` or
  `Tier-2 LLM: ` so the on-disk `origin.json` audit record retains tier
  provenance. The pre-#1221 code labeled heuristic evidence; the
  merge-fields refactor flattened that. Re-prefixing is idempotent.

- Tests now assert that the merged `confidence` is the heuristic's, not
  the LLM's. Added inline assertions to the two existing
  contradiction/disagreement tests, plus a dedicated
  `test_merge_tier_fields_confidence_matches_heuristic_call` that
  compares to `detect_origin_heuristic` directly so a future regression
  letting Tier 2 confidence leak through cannot pass silently.
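
A minimal sketch of idempotent tier prefixing, assuming evidence is a list of arbitrary values. `tag_evidence` and `TIER_PREFIXES` are hypothetical names, and leaving an entry that already carries either tier prefix untouched is an assumption of this sketch, not a documented rule.

```python
TIER_PREFIXES = ("Tier-1 heuristic: ", "Tier-2 LLM: ")


def tag_evidence(entries, prefix):
    """Prefix each evidence entry once; already-tagged entries pass through."""
    tagged = []
    for entry in entries:
        text = str(entry)  # stringify exactly once per entry
        if text.startswith(TIER_PREFIXES):
            tagged.append(text)
        else:
            tagged.append(prefix + text)
    return tagged


once = tag_evidence(["found 'Claude' marker", 3], "Tier-1 heuristic: ")
twice = tag_evidence(once, "Tier-1 heuristic: ")
```

Applying the function a second time returns the same list, which is what makes re-running the merge safe for the audit record.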

Tests: 1378 pass. Ruff check + format both clean (CI-pinned 0.4.x).
2026-04-26 19:18:57 -03:00
Igor Lins e Silva 4d33f14988 Merge pull request #1224 from MemPalace/feat/privacy-warn-external-llm
feat(privacy): warn when LLM tier sends content to external API
2026-04-26 19:18:13 -03:00
MSL 4400734867 feat(privacy): warn when LLM tier sends content to external API
4 files changed, 248 insertions, 0 deletions. 7 new tests (4 unit + 3 integration), all RED-first.

Per @milla-jovovich's question to @igorls during PR #1221 review: users
running `mempalace init` with an external LLM provider (Anthropic API,
OpenAI hosted, etc.) need a clear, explicit warning that their folder
content will be sent to the provider, that MemPalace doesn't control
how the provider logs/retains/uses that data, and how to opt out.
@igorls confirmed this should be a small follow-up PR scoped to the
warning itself, before the v3.3.4 tag.

This PR adds:

- `_endpoint_is_local(url)` helper in `mempalace/llm_client.py` —
  URL-based heuristic returning True if the hostname is on the user's
  machine or private network. Covers: localhost, 127.0.0.1, ::1,
  hostnames ending in .local (mDNS/Bonjour), IPv4 RFC1918 ranges
  (10/8, 172.16-31/12, 192.168/16), and IPv6 unique-local addresses
  (fc00::/7).

- `is_external_service` property on the `LLMProvider` base class.
  Subclasses inherit it; the endpoint URL alone determines the result
  (no provider-specific hardcoding). This means: Ollama on localhost =
  local. LM Studio on LAN = local. Anthropic with default
  `https://api.anthropic.com` = external. A user proxying Anthropic
  through localhost (advanced setup) = local, no false-positive warning.

- One-line warning print in `cmd_init` after successful provider
  acquisition, gated on `is_external_service`:

      ⚠ {provider_name} is an EXTERNAL API. Your folder content will be
      sent to the provider during init. MemPalace does not control how
      the provider logs, retains, or uses your data. Pass --no-llm to
      keep init fully local.

  The warning fires AFTER `LLM enabled: ...` so users see both that
  the LLM is engaged AND the privacy implications of where it lives,
  before Pass 0 / entity detection actually runs.

LOCAL providers (Ollama on localhost, LM Studio on localhost or LAN,
llama.cpp on localhost, vLLM on localhost) DO NOT trigger the warning —
nothing leaves the user's machine/network in those configurations.
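
The URL heuristic listed above could be sketched roughly as follows. This is an illustration, not the shipped `_endpoint_is_local`: the function name `endpoint_is_local_sketch` and the constants are hypothetical, and only the cases named in this message are covered.

```python
import ipaddress
from urllib.parse import urlparse

LOCAL_HOSTNAMES = {"localhost", "127.0.0.1", "::1"}
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # RFC1918
    ipaddress.ip_network("172.16.0.0/12"),   # RFC1918
    ipaddress.ip_network("192.168.0.0/16"),  # RFC1918
    ipaddress.ip_network("fc00::/7"),        # IPv6 unique-local
]


def endpoint_is_local_sketch(url: str) -> bool:
    """Return True if the URL's host is on this machine or a private network."""
    host = (urlparse(url).hostname or "").lower()
    if host in LOCAL_HOSTNAMES or host.endswith(".local"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name that is not obviously local
    return any(addr in net for net in PRIVATE_NETWORKS)
```

Because the decision depends only on the URL, a provider class never needs to know whether it is "cloud" or "local"; pointing the same provider at a different endpoint changes the classification.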

TDD: 7 tests added across 2 files.

Unit tests in `tests/test_llm_client.py` (4 tests, all RED-first):

1. test_ollama_provider_default_endpoint_is_local — pins that the
   default `http://localhost:11434` is classified local.
2. test_openai_compat_provider_localhost_endpoint_is_local — covers
   the LM Studio / llama.cpp / vLLM common case (localhost,
   127.0.0.1, and 192.168.x LAN).
3. test_openai_compat_provider_cloud_endpoint_is_external — pins
   that pointing openai-compat at https://api.openai.com (or any
   non-local URL) classifies as external.
4. test_anthropic_provider_default_endpoint_is_external — pins that
   AnthropicProvider's default endpoint is external (the dominant
   user-facing case for `--llm-provider anthropic`).

Integration tests in `tests/test_corpus_origin_integration.py` (3 tests,
RED-first; 1 was the critical RED — the other 2 passed by accident
since nothing printed "EXTERNAL API" before this PR):

5. test_init_prints_privacy_warning_when_provider_is_external —
   captures stdout from cmd_init with a mocked external provider,
   asserts the warning text contains "EXTERNAL API" + "--no-llm" +
   language about MemPalace not controlling provider behavior.
6. test_init_no_privacy_warning_when_provider_is_local — same flow
   with a mocked local provider, asserts the warning text does NOT
   appear.
7. test_init_no_privacy_warning_with_no_llm_flag — pins the --no-llm
   path: no provider acquisition attempted, no warning fires.

Tests: 1382 total mempalace tests pass. 2 pre-existing environmental
failures unrelated to this change (chromadb optional dep). Ruff check +
format both clean.

Backwards compatible: `is_external_service` is a new property; existing
callers don't reference it. The warning is a new print statement that
fires only when an external endpoint is acquired. The `--no-llm` opt-out
existed before this PR and continues to work identically.

Out of scope, deferred to follow-ups (deliberately not in this PR per
Igor's "small PR" guidance): Tailscale CGNAT (100.64.0.0/10) treatment,
pre-init confirmation prompt, persistent privacy-mode config flag,
explicit cloud-provider name detection. Tracked for future iteration.
2026-04-26 14:43:20 -07:00
Igor Lins e Silva 414aa3e20b Merge pull request #1198 from jphein/fix/tokenize-none-guard
fix(searcher): tolerate None documents in BM25 reranker
2026-04-26 18:41:07 -03:00
Igor Lins e Silva 30fc0ab865 Merge pull request #1201 from jphein/fix/palace-graph-none-metadata
fix(palace_graph): skip None metadata in build_graph
2026-04-26 18:37:52 -03:00
Igor Lins e Silva 025dd03047 Merge pull request #1177 from jphein/fix/blob-seq-marker-guard
fix: skip _fix_blob_seq_ids sqlite open on already-migrated palaces (#1090)
2026-04-26 18:28:55 -03:00
Igor Lins e Silva 0908786921 Merge pull request #1221 from MemPalace/feat/corpus-origin-merge-tier-fields
feat(corpus-origin): merge LLM fields into heuristic result instead of replacing
2026-04-26 18:27:51 -03:00
Igor Lins e Silva 8a1fd95b1b style: drop unused pathlib.Path imports in marker tests
2026-04-26 18:23:31 -03:00
Igor Lins e Silva 4aa93e82ad Merge pull request #1173 from jphein/fix/quarantine-on-make-client
fix: call quarantine_stale_hnsw() in make_client(); lower threshold to 5min
2026-04-26 18:22:14 -03:00
Igor Lins e Silva 43aa1aa24e style: ruff format under CI-pinned 0.4.x
2026-04-26 18:19:15 -03:00
MSL 7294b69ac8 feat(corpus-origin): merge LLM fields into heuristic result instead of replacing
2 files changed, 260 insertions, 7 deletions. 4 new tests (all RED-first).

Per @igorls's review of PR #1211 (https://github.com/MemPalace/mempalace/pull/1211#issuecomment-4322762236):
the corpus-origin Pass 0 currently lets a Tier 2 LLM result REPLACE the
heuristic result wholesale. With ``--llm`` default-on (since #1211) and a
small local model like Ollama gemma4:e4b, the LLM can return a wrong
``likely_ai_dialogue=False, confidence=0.90`` that overrides a confident
heuristic ``True``. Tier 2's persona/user/platform extraction is the whole
reason to run it; the YES/NO call should stay with the heuristic.

This PR changes ``_run_pass_zero`` in ``mempalace/cli.py`` to merge fields
instead of replacing:

  - ``likely_ai_dialogue``  → KEEP heuristic's (don't let weak LLM flip)
  - ``confidence``          → KEEP heuristic's (paired with the bool above)
  - ``primary_platform``    → TAKE LLM's when LLM provides one
  - ``user_name``           → TAKE LLM's when LLM provides one
  - ``agent_persona_names`` → TAKE LLM's when LLM provides any
  - ``evidence``            → COMBINE both signal trails

This preserves the persona-extraction value of Tier 2 (the whole point of
running it) while preventing a weak local model from flipping a confident
heuristic.
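
The field-by-field rule above can be sketched as a small pure function. This is an illustration assuming both tier results are plain dicts with the keys listed; ``merge_tier_fields`` is a hypothetical name and is not the code in ``_run_pass_zero``.

```python
def merge_tier_fields(heuristic: dict, llm: dict) -> dict:
    """Merge an LLM tier result into a heuristic result without replacing it."""
    # Start from the heuristic: likely_ai_dialogue and confidence stay as-is.
    merged = dict(heuristic)
    # Take the LLM's extraction fields only when the LLM provided a value.
    for key in ("primary_platform", "user_name", "agent_persona_names"):
        if llm.get(key):
            merged[key] = llm[key]
    # Combine both evidence trails, heuristic first.
    merged["evidence"] = list(heuristic.get("evidence", [])) + list(
        llm.get("evidence", [])
    )
    return merged


heuristic = {
    "likely_ai_dialogue": True,
    "confidence": 0.95,
    "primary_platform": None,
    "user_name": None,
    "agent_persona_names": [],
    "evidence": ["turn markers found"],
}
llm = {
    "likely_ai_dialogue": False,  # the weak-model contradiction case
    "confidence": 0.90,
    "primary_platform": "Claude",
    "user_name": "Jordan",
    "agent_persona_names": ["Echo"],
    "evidence": ["persona names in dialogue"],
}
merged = merge_tier_fields(heuristic, llm)
```

The example values are made up; the point is that the contradicting ``False`` and ``0.90`` from the LLM never reach the merged result.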

TDD: 4 tests added in tests/test_corpus_origin_integration.py covering
the four state combinations:

1. test_merge_tier_fields_heuristic_yes_llm_no_keeps_heuristic_bool —
   The exact failure mode Igor caught. Heuristic confidently flags
   AI-dialogue; mocked LLM contradicts. Asserts merged result keeps
   heuristic's True AND merges LLM's persona/user/platform fields.
   This test was the RED that drove the implementation.

2. test_merge_tier_fields_heuristic_no_no_personas_leak —
   Both tiers agree NOT AI-dialogue, both report empty personas. Pins
   that the merge doesn't accidentally introduce personas.

3. test_merge_tier_fields_heuristic_yes_llm_yes_combines_evidence —
   Both tiers agree AI-dialogue, LLM extracts personas. Pins that
   evidence from BOTH tiers ends up in the merged audit trail and
   persona/user/platform come from LLM.

4. test_merge_tier_fields_no_llm_provider_returns_heuristic_only —
   Backwards compat: with no LLM provider (``--no-llm`` path), the
   merge logic doesn't fire and behavior is identical to v3.3.4.

Tests: 1367 pass on the full mempalace suite. 2 pre-existing
environmental failures unrelated to this change (chromadb optional
dep). Ruff check + format both clean.
2026-04-26 13:23:38 -07:00
jp 247744296d fix(blob-seq-marker): tests + style nit per @igorls #1177 review
Three new tests cover the marker contract:

- test_fix_blob_seq_ids_writes_marker_after_blob_path — marker is
  written after a successful BLOB → INTEGER conversion.
- test_fix_blob_seq_ids_writes_marker_when_already_integer — marker
  is written even on a no-op (already-INTEGER) path. The point of
  the marker is to skip sqlite3.connect() on subsequent calls, not
  to record that a conversion happened.
- test_fix_blob_seq_ids_skips_sqlite_when_marker_present — verifies
  the load-bearing property via monkeypatched sqlite3.connect: when
  the marker exists, the function never opens sqlite. This is the
  bug #1090 fix — opening Python's sqlite3 against a live ChromaDB
  1.5.x WAL DB corrupts the next PersistentClient call, and we
  must never do it again post-migration.

Also folded in @igorls's style nit:

- Path(marker).touch() instead of open(marker, "a").close().
  Same effect, reads cleaner. Moved Path import to the top of the
  module since it didn't yet exist there.

35/35 backend tests pass. The "Production: fresh MCP server + stop
hook + mempalace mine subprocess" checkbox in the PR body is checked
in the PR reply — the canonical 151K palace has been running this
fix since #1177 was filed (via Syncthing-replicated source on the
disks daemon at v1.7.0); zero PersistentClient corruption since.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:21:50 -07:00
jp 942aae3ed5 fix(hnsw): address @igorls's #1173 review
Restore three `_pin_hnsw_threads` tests that the previous integrity-gate
commit deleted. The function is still live code on develop (defined at
chroma.py:207, called from chroma.py:705 + mcp_server.py), so the
deletion left the import unused (ruff F401) and dropped coverage on a
function unrelated to this PR's scope. Restored verbatim from main.

Plus three nits @igorls flagged:

- **Thread-safety doc**: `_quarantined_paths` mutation is lock-free;
  documented that idempotency of `quarantine_stale_hnsw` is the safety
  property (concurrent same-palace calls produce a benign redundant
  rename attempt that no-ops, no need for a lock).
- **Pickle protocol assumption**: `_segment_appears_healthy` requires
  PROTO ≥ 2 (`0x80`). Documented; matches what chromadb writes today,
  and a future protocol-0/1 emission would conservatively quarantine
  + lazy-rebuild rather than mis-classify as healthy.
- **Class-level vs module-level scope**: keeping class-level. The
  conftest reset is the controlled case, and module-level wouldn't
  remove the foot-gun, just relocate it. The conftest reset documented
  in the existing comment is the right pattern for test isolation.

Style nit (`Path(marker).touch()` vs `open(marker, "a").close()`)
deferred — that pattern lives in #1177's territory, not #1173's.

37/37 tests pass on the PR branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:20:14 -07:00
Igor Lins e Silva 9d18a1c32e Merge pull request #1211 from MemPalace/feat/corpus-origin
feat(init): context-aware corpus detection
2026-04-26 16:53:36 -03:00
MSL b99e54546b feat(init): context-aware corpus detection
10 files changed, 2,563 insertions, 30 deletions. 48 new tests, including end-to-end coverage live-tested with Anthropic Haiku 4.5.

This PR overhauls the first-run experience of `mempalace init` end-to-end, ships a new corpus-origin detection module from scratch, wires it into entity classification and LLM refinement, adds a graceful-fallback path that means `init` never crashes on a missing LLM, and ships a meta-test that prevents internal-coordination jargon from leaking into source or tests.

The headline change is that `mempalace init` now understands what kind of folder you're pointing it at — AI conversations, regular writing, code, narrative — and adapts how it classifies entities accordingly. The same folder containing `Echo`, `Sparrow`, and `Cipher` (names you've assigned to AI agents) used to dump those into your "people" list alongside biological humans. Now they go into a separate `agent_personas` bucket, and your `people` list stays clean.

But the broader change is that `mempalace init` got upgraded across the board — smarter defaults, smarter degradation, smarter classification, smarter persistence, and a new way to refresh as your folder grows. Built and live-verified with Anthropic Haiku 4.5; runs unmodified on the local LLM runtimes mempalace already supports.

## What changes for users (in order, from `pip install` onwards)

**Install** — `pip install mempalace` is unchanged. The package itself didn't shift.

**First run — `mempalace init <folder>`:**

1. **`init` examines your folder before classifying anything.** A free regex heuristic decides in milliseconds: AI conversations, regular writing, narrative, or code? If an LLM is reachable, a second pass extracts the corpus author's name and any agent persona names from the dialogue. v3.3.3 had no such step — it dove straight into entity detection with no corpus context.

2. **LLM-assisted classification is now ON by default.** v3.3.3 made `--llm` opt-in. The LLM-assisted path is qualitatively better (extracts persona names, refines ambiguous classifications, gives the model corpus context) so it now runs by default. The provider abstraction is unchanged from v3.3.3 — three buckets are supported by `mempalace.llm_client`:
   - **Anthropic** (`--llm-provider anthropic` + `ANTHROPIC_API_KEY`): the official Messages API. **This is the path live-verified end-to-end in this PR with Haiku 4.5.** Cost: ~$0.01 per `init`.
   - **Ollama** (`--llm-provider ollama`, the default): local models via `http://localhost:11434`. Fully offline. Honors the "zero-API required" promise.
   - **OpenAI-compatible** (`--llm-provider openai-compat` + `--llm-endpoint`): per the v3.3.3 `mempalace/llm_client.py` docstring, this covers "OpenRouter, LM Studio, llama.cpp server, vLLM, Groq, Fireworks, Together, and most self-hosted setups." We did not test each of those individually as part of this PR; the abstraction has been stable since v3.3.3. If you try this PR with a specific provider and hit a quirk, please file an issue or comment here.

3. **`init` never blocks on a missing LLM.** No Ollama running, no API key set? `init` prints a one-line message pointing at `--no-llm` and falls through to the heuristic-only path. New default behavior, new graceful fallback to support it. `--no-llm` is the new explicit opt-out.

4. **`init` shows you what it detected.** A one-line banner — `Detected: Claude (Anthropic) (user: Jordan, agents: Echo, Sparrow, Cipher)` or `Corpus origin: not AI-dialogue (confidence: 0.98)` — tells you at a glance whether mempalace understood your folder.

5. **Entity classification gets smarter across the board.** Even non-persona candidates benefit: the LLM has corpus context (this is AI-dialogue, this is the user's name, these are agent names) and uses it to disambiguate ambiguous candidates that aren't personas at all.

6. **Agent personas live in their own bucket.** Names you've assigned to AI agents (Echo, Sparrow, Cipher) go into a new `agent_personas` bucket instead of your `people` list. Your real-person entity list stays clean.

7. **Detection result persists to `<palace>/.mempalace/origin.json`** with a `schema_version: 1` envelope, so downstream tools can read it.

8. **Re-running `init` is now idempotent.** Bug fix — running `init` twice on the same folder used to give different classification results because the detection step was sampling its own `entities.json` output. Caught by integration testing during this PR.

**Later — when your folder grows:**

9. **`mempalace mine --redetect-origin`** is a new flag for refreshing the stored detection without redoing the whole `init`. Heuristic-only by design (the flag is meant to be cheap). If you want the full LLM-extracted detection refreshed (persona names, user name, etc.), run `mempalace init <yourfolder>` again — `init` is now idempotent (item 8), so re-running it on the same folder is safe.

## Behind the changes

- **New module** `mempalace/corpus_origin.py` (422 lines) with two-tier detection: regex heuristic with co-occurrence rule (suppresses ambiguous terms like `Claude` / `Gemini` / `Haiku` when no unambiguous AI signal is present, so French novels, astrology forums, poetry corpora, llama-rancher journals don't false-positive), and LLM tier that extracts `user_name` and `agent_persona_names` from dialogue structure with belt-and-suspenders user-vs-agent disambiguation.

- **Entity-classification consumer wiring.** `entity_detector.detect_entities` and `project_scanner.discover_entities` accept an optional `corpus_origin` kwarg. When present and the corpus is identified as AI-dialogue, candidates whose name case-insensitively matches an `agent_persona_name` are routed into the `agent_personas` bucket instead of `people`. Per-entity `type` is rewritten to `"agent_persona"`.

- **LLM-refine consumer wiring.** `llm_refine.refine_entities` accepts the same `corpus_origin` kwarg and prepends a `CORPUS CONTEXT` preamble to its system prompt giving the LLM the platform / user / persona context. Existing `TOPIC` / `PERSON` / `PROJECT` / `COMMON_WORD` / `AMBIGUOUS` labels are unchanged.

- **`init` overhaul.** Pass 0 (corpus-origin detection) inserted before existing Pass 1 (entity discovery). `--llm` flipped to default-on. `--no-llm` added. Graceful-fallback path replaces the previous hard-error on missing LLM. Provider precedence unchanged from the existing `llm_client` module.

- **`mine` flag.** `mempalace mine --redetect-origin` re-runs corpus-origin detection on the current corpus state and overwrites `<palace>/.mempalace/origin.json`.

- **`CLAUDE.md` design principle reworded** — "Local-first, zero external API by default." Local LLMs running on `localhost` (Ollama, LM Studio, llama.cpp, vLLM, unsloth studio) are part of the user's machine, not external APIs. External BYOK providers (Anthropic, OpenAI, Google) are supported but always opt-in, never default, never silent fallback.
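
The persona routing described in the entity-classification bullet might look like the sketch below. The entity shape (dicts with `name` and `type`) and the function name `route_agent_personas` are assumptions for illustration; the real `detect_entities` return shape is not shown here.

```python
def route_agent_personas(detected: dict, corpus_origin) -> dict:
    """Move people whose names match an agent persona into agent_personas."""
    # Without corpus context, or for non-AI-dialogue corpora, change nothing.
    if not corpus_origin or not corpus_origin.get("likely_ai_dialogue"):
        return detected
    persona_names = {
        name.lower() for name in corpus_origin.get("agent_persona_names", [])
    }
    people, personas = [], []
    for entity in detected.get("people", []):
        if entity["name"].lower() in persona_names:
            # Case-insensitive match: rewrite the per-entity type.
            personas.append({**entity, "type": "agent_persona"})
        else:
            people.append(entity)
    return {**detected, "people": people, "agent_personas": personas}


detected = {
    "people": [
        {"name": "Echo", "type": "person"},
        {"name": "Jordan", "type": "person"},
    ],
    "projects": [],
}
origin = {"likely_ai_dialogue": True, "agent_persona_names": ["echo", "Sparrow"]}
routed = route_agent_personas(detected, origin)
```

Returning the input untouched when `corpus_origin` is `None` keeps the result free of an `agent_personas` key for callers that never pass the kwarg.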

## Cost story

- **Anthropic (verified path):** ~$0.01 per `init` via Haiku 4.5 with `ANTHROPIC_API_KEY`.
- **Ollama / local LLM runtime:** zero cost. Fully offline.
- **OpenAI-compatible service:** depends entirely on the service. The abstraction supports any service speaking the standard `/v1/chat/completions` API; specific quirks vary per provider. Try it and tell us how it goes.
- **No LLM at all:** graceful fallback to heuristic-only. Zero cost. `init` never blocks.

## Backwards compatibility

- The affected public functions (`detect_entities`, `discover_entities`, `refine_entities`) gained an optional `corpus_origin` kwarg (default `None`). Callers that don't pass it see the v3.3.3 return shape unchanged: no `agent_personas` key, no behavioral change.
- The `--llm` CLI flag is preserved as a deprecated alias of the default. Existing scripts that pass it continue to work.
- `corpus_origin=None` keeps `llm_refine.SYSTEM_PROMPT` byte-identical to v3.3.3.

## Test coverage

- **19 unit tests** in `tests/test_corpus_origin.py` covering both tiers, the co-occurrence rule, ambiguous-term suppression, word-boundary brand matching, and user/persona disambiguation.
- **29 integration tests** in `tests/test_corpus_origin_integration.py` covering end-to-end through `mempalace init`, persona reclassification, the `--redetect-origin` flag, the `--llm` default flip, graceful fallback paths, and re-init idempotency. Of those 29, five specifically cover the intersection with develop's other in-flight work (Pass 0 ↔ auto-mine ordering, topics + agent_personas bucket coexistence, entities.json shape, the `wing=` kwarg threading, llm_refine TOPIC label + corpus_origin preamble composition).
- **1354 total mempalace tests pass.** 2 pre-existing environmental failures (`test_mcp_stdio_protection` — chromadb optional dep) unrelated to this change; they fail on plain `develop` too.
- **Live-smoke-tested** with real Anthropic Haiku 4.5 on AI-dialogue and narrative fixtures.

## Hygiene guardrail

This PR also adds a meta-test (`test_no_internal_coordination_jargon_in_source_or_tests`) that walks the source tree and asserts no internal-coordination jargon (e.g. development-phase markers, internal review-section references) leaks into runtime code, comments, docstrings, or LLM prompts. RED if anything slips in. Allowlist for legitimate RFC/spec section citations in `sources/`, `backends/`, `knowledge_graph.py`, and `i18n/`.
2026-04-26 12:37:26 -07:00
Igor Lins e Silva 5de5b0923d Merge pull request #936 from shaun0927/fix/diary-topic-sanitize
fix: sanitize topic parameter in tool_diary_write
2026-04-26 16:12:37 -03:00
jp 74ff5e6b98 fix(hnsw): integrity gate in quarantine_stale_hnsw — corruption vs flush-lag
Previous: quarantine fired whenever sqlite_mtime - hnsw_mtime exceeded
the (lowered, in #1173) 300s threshold. ChromaDB 1.5.x flushes HNSW
asynchronously and a clean shutdown does not force-flush, so the on-
disk HNSW is *always* meaningfully older than chroma.sqlite3 — that's
the steady state, not corruption. Quarantine renamed valid HNSW
segments on every cold-start, chromadb created empty replacements,
vector recall went to 0/N until rebuild.

Confirmed in production on the disks daemon journal, 2026-04-26
06:56:45: three of three HNSW segments quarantined on cold-start with
538-557s mtime gaps (post-clean-shutdown flush lag), leaving a
151,478-drawer palace with vector_ranked=0. Drift directories at
*.drift-20260426-065645/ each contained a complete 253MB data_level0.bin
plus 18MB index_metadata.pickle — clearly healthy indexes, renamed by
the false-positive heuristic.

Fix: two-stage gate.

  1. mtime gate (existing) — gap > stale_seconds is necessary.
  2. integrity gate (new) — sniff index_metadata.pickle for chromadb's
     expected protocol/terminator bytes (PROTO 0x80 head, STOP 0x2e
     tail) and a non-trivial size, WITHOUT deserializing the file.
     Healthy segment with mtime drift → keep in place; truncated /
     zero-filled / partial-flush → quarantine.

Format-sniff is deliberately non-deserializing — pickle deserialization
can execute arbitrary code, and the PROTO+STOP byte presence + size
floor is sufficient to distinguish a complete chromadb write from
truncation, zero-fill, or a partial flush during process kill. Real
load failures (the rare case where the bytes look right but chromadb
fails to load) still surface to palace-daemon's _auto_repair, which
calls quarantine_stale_hnsw directly on observed HNSW errors and
bypasses this gate.
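
A minimal sketch of the non-deserializing format sniff, assuming a size floor chosen by the caller. The helper name and the min_size default are hypothetical; the real floor used by _segment_appears_healthy is not stated in this message.

```python
import os
import pickle
import tempfile

PROTO = b"\x80"  # first byte of a protocol >= 2 pickle
STOP = b"."      # 0x2e, last byte of a complete pickle


def looks_like_complete_pickle(path: str, min_size: int = 16) -> bool:
    """Check head/tail bytes and size without ever unpickling the file."""
    size = os.path.getsize(path)
    if size < max(min_size, 2):
        return False
    with open(path, "rb") as fh:
        head = fh.read(1)
        fh.seek(-1, os.SEEK_END)
        tail = fh.read(1)
    return head == PROTO and tail == STOP


# Demo on throwaway files: complete, truncated, and zero-filled payloads.
payload = pickle.dumps({"k": list(range(100))}, protocol=2)
samples = {
    "complete": payload,
    "truncated": payload[:-1],
    "zero_filled": b"\x00" * len(payload),
}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for label, data in samples.items():
        path = os.path.join(tmp, label + ".pickle")
        with open(path, "wb") as fh:
            fh.write(data)
        results[label] = looks_like_complete_pickle(path)
```

A missing file raises OSError in this sketch; the commit treats a segment without metadata as a separate case, so a caller would need to handle that before sniffing.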

The cold-start gate from 70c4bc6 (row 24) remains as a perf optimization
— even with the integrity check, repeating the sniff on every reconnect
is unnecessary work — but its load-bearing role is now covered by this
deeper fix.

4 new tests in test_backends.py:

  - test_quarantine_stale_hnsw_renames_corrupt_segment (drift + bad meta)
  - test_quarantine_stale_hnsw_leaves_healthy_segment_with_drift_alone
    (drift + valid meta — the production case at 06:24)
  - test_quarantine_stale_hnsw_leaves_segment_without_metadata_alone
    (fresh / never-flushed, no meta file)
  - test_quarantine_stale_hnsw_renames_truncated_metadata (under-floor
    size, partial-flush shape)

Existing test_quarantine_stale_hnsw_renames_drifted_segment renamed
to renames_corrupt_segment with explicit corrupt meta_bytes — the old
"renames any drift" contract is gone.

Suite 1366/1366 pass.

Coordinated cross-repo with palace-daemon's auto-repair-on-startup
workaround (separate agent's commit ed3a892). With this fork-side fix
the auto-repair becomes belt-and-suspenders; the structural cause
of empty-HNSW-on-restart is addressed at the quarantine layer.

CLAUDE.md row 26 + README fork-change-queue row + test count
1363→1366.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:40:25 -07:00
jp e5e7a57930 fix(hnsw): gate quarantine_stale_hnsw to cold-start, not every reconnect
Symptom on the canonical disks daemon: drift quarantines firing every
10–30 minutes throughout the day under steady write load. Logs show
.drift-* directories accumulating despite the daemon being the only
writer (no Syncthing replication of palace data).

Root cause is a false-positive thrash in the quarantine heuristic:

- chroma.sqlite3 mtime bumps on every write (millisecond cadence).
- HNSW segment files (data_level0.bin) only flush to disk on
  chromadb's internal cadence, which can lag minutes behind sqlite
  under continuous write load.

Once the gap exceeds the 300s threshold, quarantine_stale_hnsw renames
a perfectly valid HNSW segment, chromadb rebuilds it from scratch, and
the cycle repeats as soon as the next batch of writes lands. The 300s
threshold (lowered from 3600s in PR #1173 after a 0.96h-drift production
segfault) is correct for the cross-machine-replication failure mode it
was designed for, but wrong for a daemon-strict deployment whose only
"drift" source is its own benign flush lag.

Fix: gate the proactive quarantine check to the first ``make_client()``
invocation per palace per process (``ChromaBackend._quarantined_paths``
set). Real cold-start drift (replication, partial restore, crashed-mid-
write) still gets caught — that's exactly when a fresh daemon process
opens the palace. Real runtime drift on observed HNSW errors still gets
caught via palace-daemon's ``_auto_repair`` which calls
``quarantine_stale_hnsw`` directly, bypassing this gate.

Two new tests in test_backends.py verify single-fire-per-palace and
per-palace independence. Conftest clears the gate between tests.

Suite 1362/1362 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:40:25 -07:00
jp db80f6e26c fix: call quarantine_stale_hnsw() in make_client(); lower threshold to 5min
make_client() called _fix_blob_seq_ids but skipped quarantine_stale_hnsw,
so every fresh process (stop hook, precompact hook, CLI) opened a drifted
palace and segfaulted in chromadb_rust_bindings before any write-path
protection could fire.

#1062 wires the quarantine call at MCP server startup (covers long-lived
server processes). This fix adds it to make_client() itself — the call
site that all callers (server, hooks, CLI, tests) pass through — so every
fresh PersistentClient open is protected regardless of entry point.

Also lowers stale_seconds default from 3600 to 300: a 0.96h-drifted
segment caused production segfaults before the 1h threshold fired.
ChromaDB's HNSW flush cadence means legitimate drift is seconds to low
minutes; 5min gives headroom without admitting clearly corrupt segments.
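
As a rough illustration of the threshold, the sketch below compares file mtimes and flags a segment once sqlite is more than `stale_seconds` newer than the HNSW file. The helper name `looks_drifted` and the exact comparison are assumptions; the real `quarantine_stale_hnsw` may inspect more than mtimes.

```python
import os
import tempfile

STALE_SECONDS = 300  # default named in this commit


def looks_drifted(sqlite_path, hnsw_path, stale_seconds=STALE_SECONDS):
    """True when sqlite is newer than the HNSW file by more than the threshold."""
    gap = os.path.getmtime(sqlite_path) - os.path.getmtime(hnsw_path)
    return gap > stale_seconds


with tempfile.TemporaryDirectory() as d:
    sqlite_path = os.path.join(d, "chroma.sqlite3")
    hnsw_path = os.path.join(d, "data_level0.bin")
    for p in (sqlite_path, hnsw_path):
        open(p, "wb").close()

    # HNSW file last flushed 400s before the latest sqlite write.
    os.utime(hnsw_path, (1_000_000, 1_000_000))
    os.utime(sqlite_path, (1_000_400, 1_000_400))
    lagging = looks_drifted(sqlite_path, hnsw_path)

    # Only 100s behind: within the 5-minute headroom.
    os.utime(sqlite_path, (1_000_100, 1_000_100))
    fresh = looks_drifted(sqlite_path, hnsw_path)
```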
2026-04-26 09:39:53 -07:00
Arnold Wender 0b8c2c158f fix(kg): reject inverted intervals in add_triple (valid_to < valid_from)
A triple with valid_to < valid_from can never satisfy both temporal
filter clauses in query_entity() at once:

    valid_from <= as_of AND valid_to >= as_of

so the triple is invisible to every query — silently corrupt. Reject
at write time with a clear error instead of letting bad data pile up
in the SQLite store.

The guard only fires when both bounds are present; open intervals
(only valid_from or only valid_to) are still accepted, and same-day
intervals (valid_from == valid_to, point-in-time facts) are explicitly
allowed.
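
The guard can be sketched as follows. The function name `validate_interval` and the use of `datetime.date` values are assumptions; the actual `add_triple` may store and compare its bounds differently.

```python
from datetime import date


def validate_interval(valid_from=None, valid_to=None):
    """Reject inverted intervals; open and point-in-time intervals pass."""
    if valid_from is not None and valid_to is not None and valid_to < valid_from:
        raise ValueError(
            f"valid_to ({valid_to}) is earlier than valid_from ({valid_from})"
        )


validate_interval(date(2026, 1, 1), date(2026, 1, 1))  # point-in-time fact
validate_interval(valid_from=date(2026, 1, 1))         # open-ended
validate_interval(valid_to=date(2026, 1, 1))           # open-started

try:
    validate_interval(date(2026, 2, 1), date(2026, 1, 1))
    rejected = False
except ValueError:
    rejected = True
```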
2026-04-26 13:00:27 +02:00
Igor Lins e Silva 6890948e09 Merge pull request #1210 from MemPalace/fix/repair-extraction-cap-detection
fix(repair): refuse to overwrite when extraction looks truncated (#1208)
2026-04-26 04:49:01 -03:00
bensig 452630e927 fix(repair): refuse to overwrite when extraction looks truncated (#1208)
The user-reported case in #1208: a palace with 67,580 drawers had its
HNSW files manually quarantined to recover from corruption. ``mempalace
repair`` then ran cleanly and reported "Drawers found: 10000 ... Repair
complete. 10000 drawers rebuilt." Backup was the v3.3.3 chroma.sqlite3
that did contain the full 67,580 — but the rebuilt collection only had
the first 10K. 85% data loss, no warning.

Root cause: ChromaDB's collection-layer get() silently caps at
``CHROMADB_DEFAULT_GET_LIMIT = 10_000`` rows when reading from a
collection whose segment metadata is stale (typical post-quarantine
state). col.count() returns the same capped value, so neither the
loop bound nor the extraction count flagged the truncation.

Fix is defense-in-depth, not a recovery mechanism. Repair now:

1. After extraction, queries chroma.sqlite3 directly via a read-only
   sqlite3 connection: COUNT(*) FROM embeddings JOIN segments JOIN
   collections WHERE name='mempalace_drawers'. If that count exceeds
   the extracted count, abort with a clear message before any
   destructive operation.
2. Falls back to a weaker check when the SQLite query can't run
   (chromadb schema drift, locked file): if extracted exactly equals
   CHROMADB_DEFAULT_GET_LIMIT, that's a strong-enough cap signal to
   refuse without explicit acknowledgement.
3. Adds ``--confirm-truncation-ok`` (CLI) and ``confirm_truncation_ok``
   (rebuild_index kwarg) to override after independent verification.
   Useful for the rare case of a palace genuinely sized at exactly
   10,000 drawers.

The guard logic lives in ``repair.check_extraction_safety()`` so the
two extraction paths (CLI ``cmd_repair`` and the lower-level
``rebuild_index``) share a single implementation. Raises
``TruncationDetected`` carrying the printable message.
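
A minimal sketch of the guard's decision logic, assuming a signature of `(extracted, sqlite_count, confirm_truncation_ok)` where `sqlite_count` is `None` when the direct SQLite query could not run. The signature, the messages, and the choice to let the override bypass both checks are assumptions, not the shipped implementation.

```python
CHROMADB_DEFAULT_GET_LIMIT = 10_000


class TruncationDetected(Exception):
    """Raised when the extracted drawer count looks capped."""


def check_extraction_safety(extracted, sqlite_count, confirm_truncation_ok=False):
    if confirm_truncation_ok:
        return  # operator verified the count independently
    if sqlite_count is not None:
        # Strong check: SQLite holds more rows than extraction returned.
        if sqlite_count > extracted:
            raise TruncationDetected(
                f"extracted {extracted} drawers but SQLite holds {sqlite_count}"
            )
        return
    # Weak check: SQLite unreadable and the count sits exactly at the cap.
    if extracted == CHROMADB_DEFAULT_GET_LIMIT:
        raise TruncationDetected(
            f"extracted exactly {extracted} drawers, which matches the get() cap"
        )


def aborts(*args, **kwargs):
    try:
        check_extraction_safety(*args, **kwargs)
    except TruncationDetected:
        return True
    return False
```

With the numbers from #1208, `aborts(10_000, 67_580)` reports that repair would stop before any destructive step.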

Tests: 9 new cases covering the safe path (counts match, SQLite
unreadable but well under cap), both abort paths (SQLite higher than
extracted, unreadable + at cap), the override flag, and end-to-end
behavior of ``rebuild_index`` with the guard wired in. Plus two
``sqlite_drawer_count`` tests for the missing-file and bad-schema
cases.

What's NOT in this PR: actually recovering the missing 57,580
drawers from the user's case. The on-disk SQLite still holds them;
recovery is a separate flow (direct-extract from chroma.sqlite3,
bypass the chromadb collection layer entirely). This PR's job is
to stop repair from making it worse.

Refs #1208.
2026-04-25 23:34:05 -07:00
Igor Lins e Silva 5e57404502 Merge pull request #935 from shaun0927/fix/repair-crash-safety
fix: guard against data loss in repair, migrate, and CLI rebuild
2026-04-25 20:49:23 -03:00
Igor Lins e Silva 34bc08fb30 Merge pull request #1205 from MemPalace/ci/python-313-windows-macos
ci: bump Windows and macOS jobs to Python 3.13
2026-04-25 20:03:12 -03:00
Igor Lins e Silva 57854c7e2e ci: bump Windows and macOS jobs to Python 3.13
3.11 is mid-life; 3.13 is already on the Linux matrix and gives ~3.5
years of upstream support. Aligns the single-version platform jobs with
the top of the Linux matrix. requires-python and lint job left alone.

Refs #1192 (Option A).
2026-04-25 19:59:05 -03:00
jp 5b07b869b0 fix(palace_graph): skip None metadata in build_graph
ChromaDB can return None for drawers without metadata (legacy data,
partial writes — same root cause as upstream #1020 / our PR #1094).
build_graph at line 95 called meta.get("room", "") unconditionally,
which raises AttributeError on None and breaks every consumer of
build_graph on that call path: graph_stats, find_tunnels,
traverse, and (most visibly) the daemon's /stats endpoint.

Caught 2026-04-25 by palace-daemon's verify-routes.sh smoke test
against the canonical 151K-drawer palace — /stats was 500-ing on a
single None drawer.

Adds `if meta is None: continue` guard. Closes the same gap upstream's
#999 None-metadata audit closed in searcher.py / mcp_server.py /
miner.status, just in a different file the audit didn't reach. The
graph-build is recoverable: skipping a single None drawer doesn't
distort the graph since build_graph already filters
`room and room != "general" and wing` — a missing-metadata drawer was
never going to participate anyway.
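
The shape of the guard, reduced to a standalone loop. `collect_rooms` and its return structure are hypothetical stand-ins for the relevant part of `build_graph`; the `None` check and the `room`/`wing` filter mirror what the message describes.

```python
def collect_rooms(metadatas):
    """Map each non-general room to the wings it appears in, skipping None entries."""
    rooms = {}
    for meta in metadatas:
        if meta is None:
            continue  # legacy or partially written drawer without metadata
        room = meta.get("room", "")
        wing = meta.get("wing", "")
        if room and room != "general" and wing:
            rooms.setdefault(room, set()).add(wing)
    return rooms


graph = collect_rooms(
    [
        {"room": "auth", "wing": "backend"},
        None,
        {"room": "auth", "wing": "frontend"},
    ]
)
```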

Test: TestBuildGraph::test_none_metadata_does_not_crash mixes a None
entry into a 3-drawer fixture and asserts the two real drawers are
processed normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 11:06:32 -07:00
jp bc24aa14e2 fix: skip _fix_blob_seq_ids sqlite open on already-migrated palaces (#1090)
Opening chroma.sqlite3 via Python's sqlite3.connect() against a live
ChromaDB 1.5.x WAL-mode database leaves state that segfaults the next
PersistentClient call — the same failure mode tracked at #1090.

_fix_blob_seq_ids runs unconditionally on every make_client() call, so
every fresh process (MCP server, stop hook, CLI) re-triggers the sqlite
open → corrupt → segfault cycle on palaces that have already completed
the 0.6.x → 1.5.x seq_id migration.

Guard with a .blob_seq_ids_migrated marker file in the palace directory:
- If marker exists, return immediately — skip sqlite entirely
- After successful migration (or confirmation that no BLOBs remain),
  write the marker so subsequent opens take the fast path
- Palaces that never had BLOB seq_ids also get the marker on first open,
  so they too avoid the redundant sqlite open after that
- Already-migrated palaces can touch the marker manually to opt in
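
The marker-file fast path can be sketched like this. The function name `fix_blob_seq_ids`, the injected `migrate` callable, and the boolean return value are assumptions for illustration; only the marker filename comes from the message.

```python
import os
import tempfile

MARKER_NAME = ".blob_seq_ids_migrated"


def fix_blob_seq_ids(palace_dir, migrate):
    """Run the migration at most once per palace; return True if it ran."""
    marker = os.path.join(palace_dir, MARKER_NAME)
    if os.path.exists(marker):
        return False  # fast path: never touch sqlite again
    migrate(palace_dir)  # stands in for the sqlite open + BLOB conversion
    with open(marker, "w"):
        pass  # empty marker file is enough
    return True


with tempfile.TemporaryDirectory() as palace_dir:
    runs = []
    first = fix_blob_seq_ids(palace_dir, runs.append)
    second = fix_blob_seq_ids(palace_dir, runs.append)
```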

Test plan: Direct test — run _fix_blob_seq_ids twice against a fresh
palace; second call returns immediately because marker exists. 1094
existing tests pass.
2026-04-25 07:42:05 -07:00
jp 67248330c5 chore: ruff format tests/test_searcher.py
CI lint job runs `ruff format --check`; the new tests in TestBM25NoneSafety
needed the standard "blank line after import-inside-function" + line-length
wrap. No logic change — formatter pass only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 07:22:53 -07:00
jp ee12c07c54 fix(searcher): tolerate None documents in BM25 reranker
`_tokenize` calls `text.lower()` unconditionally; when ChromaDB returns a
drawer with `documents` containing `None`, the hybrid-rerank path raises
`AttributeError: 'NoneType' object has no attribute 'lower'`.

Observed in production daemon log (2026-04-24 21:07:05) during a search
that triggered `_hybrid_rank → _bm25_scores → _tokenize`:

    File "mempalace/searcher.py", line 81, in _bm25_scores
        tokenized = [_tokenize(d) for d in documents]
    File "mempalace/searcher.py", line 52, in _tokenize
        return _TOKEN_RE.findall(text.lower())
    AttributeError: 'NoneType' object has no attribute 'lower'

Closes the gap left by the upstream None-metadata audit (#999), which
covered metadata loops but not BM25 helpers. Returns `[]` for falsy input
so a None doc gets score 0.0 while the rest of the corpus reranks normally.
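
A minimal sketch of the None-tolerant tokenizer. The `_TOKEN_RE` pattern here is an assumption; the real pattern in `searcher.py` may differ.

```python
import re

_TOKEN_RE = re.compile(r"\w+")  # assumed pattern, for illustration only


def _tokenize(text):
    """Return lowercase tokens; falsy input (None, "") yields no tokens."""
    if not text:
        return []
    return _TOKEN_RE.findall(text.lower())


tokenized = [_tokenize(d) for d in ["Hello World", None, ""]]
```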

Three regression tests in TestBM25NoneSafety lock the behavior and reference
the production trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 06:27:14 -07:00
Igor Lins e Silva 0d9929c0dd Merge pull request #976 from felipetruman/fix/hnsw-race-and-fanout
fix: HNSW graph corruption, PreCompact deadlock, mine fan-out (closes #974, #965, #955)
2026-04-25 05:00:14 -03:00
Igor Lins e Silva 7773432bca chore(rebase): reconcile with develop and apply ruff format
After rebasing onto current develop:
- chroma.py: keep develop's quarantine_stale_hnsw + UnsupportedFilterError
  validation alongside this PR's _pin_hnsw_threads retrofit.
- tests/test_backends.py: combine quarantine_stale_hnsw and
  _pin_hnsw_threads test sections; ruff format.
- miner.py: propagate the new `files=` kwarg (added on develop in #1183
  for the init -> mine flow) through _mine_impl so the caller can pass
  a pre-scanned file list under the global lock.
2026-04-25 04:39:31 -03:00
Felipe Truman 8df944a54d fix: best-effort HNSW thread-pin retrofit + drop dead attempt-cap constant
Addresses remaining PR #976 review items after rebase on develop.

`get_collection(create=False)` previously returned existing collections without
re-applying `hnsw:num_threads=1`, so palaces created before the fix kept the
unsafe parallel-insert path. Add `_pin_hnsw_threads()` helper that calls
`collection.modify(configuration=UpdateCollectionConfiguration(
hnsw=UpdateHNSWConfiguration(num_threads=1)))` best-effort on every
`get_collection` call (including the MCP server's `_get_collection`).

In chromadb 1.5.x the runtime config does not persist to disk across
`PersistentClient` reopens, so the retrofit is re-applied each process start
rather than being a one-shot migration. Fresh palaces keep the metadata-based
pin as primary defense; legacy palaces now also get per-session protection
without requiring `mempalace nuke` + re-mine.

After the rebase on develop, `hook_precompact` delegates to `_mine_sync` and
no longer emits `decision: block`, so the attempt-cap constant was orphaned.
Grep confirms 0 usages in the repo — remove it.

- `_pin_hnsw_threads` retrofits legacy collection (num_threads None -> 1)
- `_pin_hnsw_threads` swallows all errors (never raises)
- `ChromaBackend.get_collection(create=False)` applies retrofit on legacy palace
- 62 tests pass (10 backends + 6 palace locks + 46 hooks_cli)
2026-04-25 04:36:29 -03:00
Felipe Truman 40d7958ca1 test: remove attempt-cap tests obsoleted by develop's pass-through approach
PR #863 on develop eliminated precompact blocking entirely. After rebasing,
the attempt-cap tests (test_precompact_first_two_attempts_block,
test_precompact_passes_through_after_cap, test_precompact_counter_is_per_session)
would always fail because hook_precompact now mines synchronously and
passes through unconditionally. Remove them to keep the suite green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 04:34:30 -03:00
Felipe Truman 1998aede66 fix: Windows CI compat for palace lock tests and path normalization
Addresses the two actionable Copilot comments from the 2nd review pass.

tests/test_palace_locks.py (#7, #8)
  multiprocessing.get_context("fork") is unavailable on Windows, so the
  cross-process tests would crash the Windows CI runner. Added
  `_get_mp_context()` that picks "spawn" on Windows and "fork" elsewhere.
  Spawn re-imports the module in the child; it inherits os.environ
  (including the monkeypatched HOME), which is all these tests need.
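
  One way the helper could look, assuming the platform check is done via `sys.platform` (the actual test helper may detect Windows differently):

```python
import multiprocessing
import sys


def _get_mp_context():
    """Use "spawn" on Windows, where "fork" is unavailable; "fork" elsewhere."""
    method = "spawn" if sys.platform == "win32" else "fork"
    return multiprocessing.get_context(method)


ctx = _get_mp_context()
```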

mempalace/palace.py (#10)
  The per-palace lock key was computed from os.path.abspath(palace_path).
  On Windows the filesystem is case-insensitive, so `C:\Palace` and
  `c:\palace` would hash to different keys and two concurrent mines
  could touch the same on-disk palace. Switched to
  `os.path.normcase(os.path.realpath(...))` so:
    * realpath resolves symlinks and `..` segments
    * normcase folds case on Windows (no-op on POSIX)
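
  A sketch of the normalized lock key, assuming the full SHA-256 hex digest is used in the `mine_palace_<hash>.lock` filename (the real code may truncate or encode it differently; `palace_lock_name` is a hypothetical name):

```python
import hashlib
import os


def palace_lock_name(palace_path):
    """Derive a lock filename that is stable across equivalent spellings of a path."""
    normalized = os.path.normcase(os.path.realpath(palace_path))
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"mine_palace_{digest}.lock"


direct = palace_lock_name("palace")
via_dotdot = palace_lock_name(os.path.join("sub", "..", "palace"))
other = palace_lock_name("other_palace")
```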

Testing
  pytest tests/test_palace_locks.py tests/test_hooks_cli.py
         tests/test_backends.py tests/test_cli.py
  → 98 passed, 0 failed.
2026-04-25 04:34:30 -03:00
Felipe Truman 99b820cb42 fix: address PR review — per-palace lock, MCP server path, hook timeout, tests
Addresses the six Copilot review comments on the initial commit.

1) #6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend

   The MCP server creates its palace collection directly via
   `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`,
   not through `ChromaBackend.get_collection`. That path was missing the
   `hnsw:num_threads=1` metadata, so the primary crash surface for #974
   and #965 was untouched by the original patch. Fixed by passing
   `hnsw:num_threads=1` at the mcp_server create site too. Documented
   in a code comment that the setting is only honored at creation
   time — existing palaces created before this fix still need a
   `mempalace nuke` + re-mine to gain the protection.

2) #3 — mine_global_lock over-serialized mines across unrelated palaces

   Replaced the single global lock file `mine_global.lock` with a
   per-palace lock keyed by `sha256(os.path.abspath(palace_path))`
   (`mine_palace_<hash>.lock`). Mines against the same palace still
   collapse to a single runner (the correctness boundary), but mines
   against *different* palaces are now free to run in parallel.
   `mine_global_lock` is kept as a backward-compatible alias for
   `mine_palace_lock` so any external callers that imported the
   previous name keep working.

3) #1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired

   `subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow
   palaces. The previous `except OSError` clause didn't catch it, so
   the hook could raise and fail to emit any JSON decision — leaving
   the harness without a block/passthrough signal. Fixed by catching
   `(OSError, subprocess.TimeoutExpired)` together and always falling
   through to the block decision so the hook reliably emits a response.

4) #2 + #4 — tests

   - tests/test_hooks_cli.py: added
     `test_precompact_first_two_attempts_block`,
     `test_precompact_passes_through_after_cap`, and
     `test_precompact_counter_is_per_session` to lock in the #955
     deadlock fix.
   - tests/test_palace_locks.py (new): covers `mine_palace_lock`
     single-acquire, reuse-after-release, cross-process serialization
     on the same palace, non-interference across different palaces,
     path normalization, and the `mine_global_lock` back-compat alias.

5) #5 — known limitation, documented but not auto-fixed

   Copilot suggested detecting collections missing `hnsw:num_threads=1`
   and calling `collection.modify(metadata=...)` to retrofit existing
   palaces. Verified against chromadb 1.5.7: `modify(metadata=...)`
   replaces metadata rather than merging, and re-passing
   `hnsw:space="cosine"` then raises `ValueError: Changing the
   distance function of a collection once it is created is not
   supported currently.` The HNSW runtime configuration
   (`configuration_json`) also does not expose `num_threads` in
   chromadb 1.5.x, so the flag appears to be read only at creation
   time. Rather than paper over the limitation with a best-effort
   `modify` that silently drops `hnsw:space`, documented in the
   mcp_server comment that pre-existing palaces need a
   `mempalace nuke` + re-mine to gain the protection. Fresh palaces
   are always protected.
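
The exception handling from item 3 can be sketched as below. `run_mine` and its boolean return are hypothetical; the point is only that `subprocess.TimeoutExpired` is not a subclass of `OSError`, so both must be caught for the caller to always reach its decision step.

```python
import subprocess
import sys


def run_mine(cmd, timeout):
    """Return True if the subprocess finished, False on launch failure or timeout."""
    try:
        subprocess.run(cmd, timeout=timeout, check=False)
        return True
    except (OSError, subprocess.TimeoutExpired):
        # Swallow both so the caller can still emit its JSON decision.
        return False


slow = run_mine([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.2)
missing = run_mine(["mempalace-binary-that-does-not-exist"], timeout=1)
quick = run_mine([sys.executable, "-c", "pass"], timeout=30)
```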

Testing
- pytest tests/test_palace_locks.py tests/test_hooks_cli.py
  tests/test_backends.py tests/test_cli.py → **98 passed, 0 failed**.
- Runtime validation with two concurrent `mempalace mine` calls:
  - Different palaces → both complete in parallel ✓
  - Same palace     → one completes, the other exits with
    "another `mine` is already running against <palace> — exiting
    cleanly." ✓
2026-04-25 04:34:30 -03:00
Felipe Truman 7e18a70796 fix: resolve hooks_cli.py merge conflict + add mine_global_lock tests
- Resolve UU conflict in hooks_cli.py: take develop/HEAD approach
  (mine synchronously via _mine_sync, then pass through unconditionally).
  _mine_sync already catches subprocess.TimeoutExpired — fixes Copilot #1.
- Add tests/test_palace_locks.py: 4 tests covering mine_global_lock
  non-blocking semantics (acquire, second-acquire raises MineAlreadyRunning,
  reusable after release, release on exception) — fixes Copilot #4.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 04:34:30 -03:00
Igor Lins e Silva 91a60263e3 Merge pull request #1168 from arnoldwender/fix/security-tunnels-permissions
fix(security): restrict tunnels.json file permissions
2026-04-25 04:21:44 -03:00
Igor Lins e Silva 320aab31e3 Merge pull request #939 from mvalentsev/ci/pip-cache-and-python-bump
ci: add pip caching and bump Python on macOS/Windows
2026-04-25 04:12:57 -03:00
Igor Lins e Silva 5ed24ad061 Merge pull request #969 from MemPalace/dependabot/github_actions/actions/checkout-6
chore(deps): bump actions/checkout from 4 to 6
2026-04-25 04:04:37 -03:00
Igor Lins e Silva 036742e888 Merge pull request #968 from MemPalace/dependabot/github_actions/actions/upload-pages-artifact-5
chore(deps): bump actions/upload-pages-artifact from 3 to 5
2026-04-25 04:04:18 -03:00
Igor Lins e Silva 000acc1e33 Merge pull request #967 from MemPalace/dependabot/github_actions/actions/deploy-pages-5
chore(deps): bump actions/deploy-pages from 4 to 5
2026-04-25 04:03:58 -03:00
Igor Lins e Silva 9b7536a1f7 Merge pull request #1101 from wahajahmed010/fix/hooks-tutorial-1037
docs: fix HOOKS_TUTORIAL.md paths, matcher, and missing timeout (#1037)
2026-04-25 03:48:36 -03:00
Igor Lins e Silva 374fb5656b Merge pull request #1183 from MemPalace/feat/init-mine-ux
feat(cli): init prompts to mine, mine handles Ctrl-C gracefully (#1181, #1182)
2026-04-25 01:24:23 -03:00
Igor Lins e Silva c4eeec8642 test: use shlex.quote in resume-hint assertions for Windows
The pre-existing test_maybe_run_mine_prompt_declined_prints_hint
asserted the bare unquoted form `mempalace mine {tmp_path}`. After
the production code switched to shlex.quote on the resume hint, this
passed on Linux/macOS (POSIX paths have no characters that trigger
quoting) but failed on Windows where backslashes always get wrapped
in single quotes.

Mirror the production code in the assertion via shlex.quote so it's
portable across platforms; do the same for the two new
spaces-in-path tests for consistency.
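
To illustrate why the unquoted assertion only passed on POSIX: `shlex.quote` leaves a plain POSIX path untouched but wraps anything containing a backslash or a space in single quotes. The `resume_hint` helper is a hypothetical stand-in for the production code that prints the hint.

```python
import shlex


def resume_hint(path):
    """Build the hint shown when the user declines the mine prompt."""
    return f"mempalace mine {shlex.quote(str(path))}"


posix_hint = resume_hint("/tmp/palace")
windows_hint = resume_hint("C:\\Users\\me\\palace")
spaces_hint = resume_hint("/tmp/my palace")
```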
2026-04-25 01:18:31 -03:00