b99e54546b
10 files changed. 2,563 insertions, 30 deletions. 48 new tests, including end-to-end coverage live-tested with Anthropic Haiku 4.5. This PR overhauls the first-run experience of `mempalace init` end-to-end, ships a new corpus-origin detection module from scratch, wires it into entity classification and LLM refinement, adds a graceful-fallback path that means `init` never crashes on a missing LLM, and ships a meta-test that prevents internal-coordination jargon from leaking into source or tests. The headline change is that `mempalace init` now understands what kind of folder you're pointing it at — AI conversations, regular writing, code, narrative — and adapts how it classifies entities accordingly. The same folder containing `Echo`, `Sparrow`, and `Cipher` (names you've assigned to AI agents) used to dump those into your "people" list alongside biological humans. Now they go into a separate `agent_personas` bucket, and your `people` list stays clean. But the broader change is that `mempalace init` got upgraded across the board — smarter defaults, smarter degradation, smarter classification, smarter persistence, and a new way to refresh as your folder grows. Built and live-verified with Anthropic Haiku 4.5; runs unmodified on the local LLM runtimes mempalace already supports. ## What changes for users (in order, from `pip install` onwards) **Install** — `pip install mempalace` is unchanged. The package itself didn't shift. **First run — `mempalace init <folder>`:** 1. **`init` examines your folder before classifying anything.** A free regex heuristic decides in milliseconds: AI conversations, regular writing, narrative, or code? If an LLM is reachable, a second pass extracts the corpus author's name and any agent persona names from the dialogue. v3.3.3 had no such step — it dove straight into entity detection with no corpus context. 2. **LLM-assisted classification is now ON by default.** v3.3.3 made `--llm` opt-in. The LLM-assisted path is qualitatively better (extracts persona names, refines ambiguous classifications, gives the model corpus context) so it now runs by default. The provider abstraction is unchanged from v3.3.3 — three buckets are supported by `mempalace.llm_client`: - **Anthropic** (`--llm-provider anthropic` + `ANTHROPIC_API_KEY`) — the official Messages API. **This is the path live-verified end-to-end in this PR with Haiku 4.5.** Cost: ~\$0.01 per `init`. - **Ollama** (`--llm-provider ollama` — the default) — local models via `http://localhost:11434`. Fully offline. Honors the "zero-API required" promise. - **OpenAI-compatible** (`--llm-provider openai-compat` + `--llm-endpoint`) — per the v3.3.3 `mempalace/llm_client.py` docstring, this covers "OpenRouter, LM Studio, llama.cpp server, vLLM, Groq, Fireworks, Together, and most self-hosted setups." We did not test each of those individually as part of this PR; the abstraction has been stable since v3.3.3. If you try this PR with a specific provider and hit a quirk, please file an issue or comment here. 3. **`init` never blocks on a missing LLM.** No Ollama running, no API key set? `init` prints a one-line message pointing at `--no-llm` and falls through to the heuristic-only path. New default behavior, new graceful fallback to support it. `--no-llm` is the new explicit opt-out. 4. **`init` shows you what it detected.** A one-line banner — `Detected: Claude (Anthropic) (user: Jordan, agents: Echo, Sparrow, Cipher)` or `Corpus origin: not AI-dialogue (confidence: 0.98)` — tells you at a glance whether mempalace understood your folder. 5. **Entity classification gets smarter across the board.** Even non-persona candidates benefit: the LLM has corpus context (this is AI-dialogue, this is the user's name, these are agent names) and uses it to disambiguate ambiguous candidates that aren't personas at all. 6. **Agent personas live in their own bucket.** Names you've assigned to AI agents (Echo, Sparrow, Cipher) go into a new `agent_personas` bucket instead of your `people` list. Your real-person entity list stays clean. 7. **Detection result persists to `<palace>/.mempalace/origin.json`** with a `schema_version: 1` envelope, so downstream tools can read it. 8. **Re-running `init` is now idempotent.** Bug fix — running `init` twice on the same folder used to give different classification results because the detection step was sampling its own `entities.json` output. Caught by integration testing during this PR. **Later — when your folder grows:** 9. **`mempalace mine --redetect-origin`** is a new flag for refreshing the stored detection without redoing the whole `init`. Heuristic-only by design (the flag is meant to be cheap). If you want the full LLM-extracted detection refreshed (persona names, user name, etc.), run `mempalace init <yourfolder>` again — `init` is now idempotent (item 8), so re-running it on the same folder is safe. ## Behind the changes - **New module** `mempalace/corpus_origin.py` (422 lines) with two-tier detection: regex heuristic with co-occurrence rule (suppresses ambiguous terms like `Claude` / `Gemini` / `Haiku` when no unambiguous AI signal is present, so French novels, astrology forums, poetry corpora, llama-rancher journals don't false-positive), and LLM tier that extracts `user_name` and `agent_persona_names` from dialogue structure with belt-and-suspenders user-vs-agent disambiguation. - **Entity-classification consumer wiring.** `entity_detector.detect_entities` and `project_scanner.discover_entities` accept an optional `corpus_origin` kwarg. When present and the corpus is identified as AI-dialogue, candidates whose name case-insensitively matches an `agent_persona_name` are routed into the `agent_personas` bucket instead of `people`. Per-entity `type` is rewritten to `"agent_persona"`. - **LLM-refine consumer wiring.** `llm_refine.refine_entities` accepts the same `corpus_origin` kwarg and prepends a `CORPUS CONTEXT` preamble to its system prompt giving the LLM the platform / user / persona context. Existing `TOPIC` / `PERSON` / `PROJECT` / `COMMON_WORD` / `AMBIGUOUS` labels are unchanged. - **`init` overhaul.** Pass 0 (corpus-origin detection) inserted before existing Pass 1 (entity discovery). `--llm` flipped to default-on. `--no-llm` added. Graceful-fallback path replaces the previous hard-error on missing LLM. Provider precedence unchanged from the existing `llm_client` module. - **`mine` flag.** `mempalace mine --redetect-origin` re-runs corpus-origin detection on the current corpus state and overwrites `<palace>/.mempalace/origin.json`. - **`CLAUDE.md` design principle reworded** — "Local-first, zero external API by default." Local LLMs running on `localhost` (Ollama, LM Studio, llama.cpp, vLLM, unsloth studio) are part of the user's machine, not external APIs. External BYOK providers (Anthropic, OpenAI, Google) are supported but always opt-in, never default, never silent fallback. ## Cost story - **Anthropic (verified path):** ~\$0.01 per `init` via Haiku 4.5 with `ANTHROPIC_API_KEY`. - **Ollama / local LLM runtime:** zero cost. Fully offline. - **OpenAI-compatible service:** depends entirely on the service. The abstraction supports any service speaking the standard `/v1/chat/completions` API; specific quirks vary per provider. Try it and tell us how it goes. - **No LLM at all:** graceful fallback to heuristic-only. Zero cost. `init` never blocks. ## Backwards compatibility - All public function signatures gained the `corpus_origin` kwarg as optional (default `None`). Callers that don't pass it see the v3.3.3 return shape unchanged — no `agent_personas` key, no behavioral change. - The `--llm` CLI flag is preserved as a deprecated alias of the default. Existing scripts that pass it continue to work. - `corpus_origin=None` keeps `llm_refine.SYSTEM_PROMPT` byte-identical to v3.3.3. ## Test coverage - **19 unit tests** in `tests/test_corpus_origin.py` covering both tiers, the co-occurrence rule, ambiguous-term suppression, word-boundary brand matching, and user/persona disambiguation. - **29 integration tests** in `tests/test_corpus_origin_integration.py` covering end-to-end through `mempalace init`, persona reclassification, the `--redetect-origin` flag, the `--llm` default flip, graceful fallback paths, and re-init idempotency. Of those 29, five specifically cover the intersection with develop's other in-flight work (Pass 0 ↔ auto-mine ordering, topics + agent_personas bucket coexistence, entities.json shape, the `wing=` kwarg threading, llm_refine TOPIC label + corpus_origin preamble composition). - **1354 total mempalace tests pass.** 2 pre-existing environmental failures (`test_mcp_stdio_protection` — chromadb optional dep) unrelated to this change; they fail on plain `develop` too. - **Live-smoke-tested** with real Anthropic Haiku 4.5 on AI-dialogue and narrative fixtures. ## Hygiene guardrail This PR also adds a meta-test (`test_no_internal_coordination_jargon_in_source_or_tests`) that walks the source tree and asserts no internal-coordination jargon (e.g. development-phase markers, internal review-section references) leaks into runtime code, comments, docstrings, or LLM prompts. RED if anything slips in. Allowlist for legitimate RFC/spec section citations in `sources/`, `backends/`, `knowledge_graph.py`, and `i18n/`.
1391 lines
55 KiB
Python
1391 lines
55 KiB
Python
"""Integration tests proving corpus_origin actually improves classification.
|
||
|
||
These are the tests that justify the PR. Without them, the PR ships
|
||
infrastructure that nobody can prove improves v3.3.3.
|
||
|
||
The fixture: a small AI-dialogue corpus with three agent persona names
|
||
(Echo, Sparrow, Cipher) that the user (Jordan) has assigned to their AI
|
||
agents. On plain v3.3.3, entity_detector misclassifies these as PEOPLE.
|
||
With corpus_origin context wired through, they classify as
|
||
AGENT_PERSONA instead.
|
||
|
||
Two tests sit side by side:
|
||
|
||
test_baseline_v333_misclassifies_persona_names_as_people
|
||
Pins v3.3.3's behavior. If this starts failing, the PR's motivation
|
||
has shifted and the corpus_origin docs need revisiting.
|
||
|
||
test_corpus_origin_reclassifies_personas
|
||
The fix. Asserts that when corpus_origin context is passed,
|
||
persona names land in agent_personas instead of people.
|
||
|
||
Together: documented before/after of v3.3.3 → corpus-origin feature.
|
||
"""
|
||
|
||
from __future__ import annotations
|
||
|
||
import argparse
|
||
import json
|
||
from pathlib import Path
|
||
from unittest.mock import MagicMock, patch
|
||
|
||
import pytest
|
||
|
||
|
||
# A synthetic but realistic Claude Code transcript fixture. Three persona
|
||
# names appear repeatedly in dialogue patterns that the v3.3.3
|
||
# entity_detector treats as person-evidence (dialogue markers, action verbs,
|
||
# pronoun proximity). User name "Jordan" also appears in dialogue.
|
||
#
|
||
# The point is: every name here CAN be a real human name. v3.3.3 has no
|
||
# way to know that in this corpus they're agent personas, not people. The
|
||
# corpus_origin gives it that context.
|
||
AI_DIALOGUE_FIXTURE = """\
|
||
# Session log — 2026-04-20
|
||
|
||
Jordan: Echo, can you summarize what we worked on yesterday?
|
||
|
||
Echo (assistant): Yesterday we refactored the embedding pipeline. I noticed
|
||
the chunking strategy was producing overlapping windows, and I suggested
|
||
moving to a sliding window with explicit stride. You agreed and we shipped
|
||
the change.
|
||
|
||
Jordan: Good. Sparrow, what about the migration script — did you finish?
|
||
|
||
Sparrow (assistant): Yes, I finished the migration. I tested it locally
|
||
against the staging snapshot and it ran clean. I also added a rollback
|
||
path because you asked me to be cautious about the indexes.
|
||
|
||
Jordan: Perfect. Cipher, run the verification suite please.
|
||
|
||
Cipher (assistant): Running now. I'll report back when the full suite
|
||
completes. I expect it to take about four minutes.
|
||
|
||
Echo: Jordan, while Cipher runs the verification, do you want me to draft
|
||
the changelog entry for today's work?
|
||
|
||
Jordan: Yes please. Echo, keep it short. Sparrow, please review Echo's
|
||
draft when she sends it.
|
||
|
||
Sparrow: Will do. I'll look for clarity issues and check the migration
|
||
phrasing matches what we actually shipped.
|
||
|
||
Cipher: Verification complete. All 1247 tests pass. I'm filing the run log
|
||
to the palace under wing/today.
|
||
|
||
Jordan: Thanks Cipher. Echo, send the changelog draft.
|
||
|
||
Echo: Done. Sent to the channel. Sparrow, ready for review when you are.
|
||
|
||
Sparrow: Reviewed. Two small wording changes — sent back. Otherwise clean.
|
||
|
||
Jordan: Echo, apply Sparrow's edits and ship it.
|
||
|
||
Echo: Shipped. Tag pushed.
|
||
"""
|
||
|
||
|
||
@pytest.fixture
|
||
def ai_dialogue_corpus(tmp_path: Path) -> Path:
|
||
"""Create a one-file project directory containing the AI-dialogue fixture."""
|
||
project_dir = tmp_path / "ai_dialogue_project"
|
||
project_dir.mkdir()
|
||
(project_dir / "session_log.md").write_text(AI_DIALOGUE_FIXTURE)
|
||
return project_dir
|
||
|
||
|
||
@pytest.fixture
|
||
def corpus_origin_for_fixture() -> dict:
|
||
"""The corpus_origin result a context-aware init would produce for the fixture."""
|
||
return {
|
||
"schema_version": 1,
|
||
"detected_at": "2026-04-26T00:00:00Z",
|
||
"result": {
|
||
"likely_ai_dialogue": True,
|
||
"confidence": 0.95,
|
||
"primary_platform": "Claude (Anthropic)",
|
||
"user_name": "Jordan",
|
||
"agent_persona_names": ["Echo", "Sparrow", "Cipher"],
|
||
"evidence": ["Synthetic fixture for the integration test"],
|
||
},
|
||
}
|
||
|
||
|
||
# ── Baseline test: pin v3.3.3 behavior ────────────────────────────────────
|
||
|
||
|
||
def test_baseline_v333_misclassifies_persona_names_as_people(ai_dialogue_corpus: Path):
|
||
"""Without corpus_origin context, v3.3.3 entity_detector cannot
|
||
distinguish agent persona names from real people, and classifies them
|
||
into the 'people' bucket.
|
||
|
||
This test pins that behavior. Its purpose is documentation —
|
||
The corpus-origin feature's job is to fix this, and the post-fix test below
|
||
asserts the fix.
|
||
"""
|
||
from mempalace.entity_detector import detect_entities, scan_for_detection
|
||
|
||
files = scan_for_detection(str(ai_dialogue_corpus))
|
||
detected = detect_entities(files)
|
||
|
||
people_names = {e["name"] for e in detected.get("people", [])}
|
||
uncertain_names = {e["name"] for e in detected.get("uncertain", [])}
|
||
all_classified = people_names | uncertain_names
|
||
|
||
# Persona names appear somewhere in the detection output (people or uncertain).
|
||
# If none of them surface at all, the fixture is no longer triggering
|
||
# the misclassification path and the test is no longer meaningful.
|
||
persona_names = {"Echo", "Sparrow", "Cipher"}
|
||
persona_hits = persona_names & all_classified
|
||
assert persona_hits, (
|
||
"Fixture no longer surfaces persona names as detected entities. "
|
||
"Update the fixture to keep this test meaningful."
|
||
)
|
||
|
||
# No agent_personas bucket exists on v3.3.3.
|
||
assert "agent_personas" not in detected, (
|
||
"v3.3.3 has no concept of agent_personas — if this key exists, "
|
||
"corpus-origin wiring has already shipped and this baseline test is stale."
|
||
)
|
||
|
||
|
||
# ── corpus-origin test: with corpus_origin, personas reclassify ───────────
|
||
|
||
|
||
def test_corpus_origin_reclassifies_personas(
|
||
ai_dialogue_corpus: Path, corpus_origin_for_fixture: dict
|
||
):
|
||
"""When corpus_origin context is passed to detect_entities, names
|
||
matching agent_persona_names land in an 'agent_personas' bucket
|
||
instead of being misclassified as people.
|
||
|
||
This is the fix. RED until the consumer wiring lands.
|
||
"""
|
||
from mempalace.entity_detector import detect_entities, scan_for_detection
|
||
|
||
files = scan_for_detection(str(ai_dialogue_corpus))
|
||
detected = detect_entities(files, corpus_origin=corpus_origin_for_fixture)
|
||
|
||
# New bucket exists.
|
||
assert "agent_personas" in detected, (
|
||
"The corpus-origin wiring must add an 'agent_personas' bucket to the detect_entities "
|
||
"return shape when corpus_origin is provided."
|
||
)
|
||
|
||
persona_names_in_bucket = {e["name"] for e in detected["agent_personas"]}
|
||
persona_names_in_people = {e["name"] for e in detected.get("people", [])}
|
||
|
||
# All three personas land in the new bucket.
|
||
expected_personas = {"Echo", "Sparrow", "Cipher"}
|
||
assert expected_personas <= persona_names_in_bucket, (
|
||
f"Expected all three personas in agent_personas, got: " f"{persona_names_in_bucket}"
|
||
)
|
||
|
||
# And NONE of them remain in the people bucket.
|
||
leaked = expected_personas & persona_names_in_people
|
||
assert not leaked, (
|
||
f"Persona names {leaked} leaked into 'people' bucket — the corpus-origin "
|
||
f"consumer wiring is supposed to filter them out."
|
||
)
|
||
|
||
|
||
# ── discover_entities (project_scanner) threads corpus_origin ─────────────
|
||
|
||
|
||
def test_discover_entities_threads_corpus_origin_through(
|
||
ai_dialogue_corpus: Path, corpus_origin_for_fixture: dict
|
||
):
|
||
"""discover_entities is the higher-level entry point cmd_init uses.
|
||
It must accept corpus_origin and produce the same persona reclassification
|
||
that detect_entities does, regardless of whether candidates entered via
|
||
prose, manifests, or git authors.
|
||
"""
|
||
from mempalace.project_scanner import discover_entities
|
||
|
||
detected = discover_entities(
|
||
str(ai_dialogue_corpus),
|
||
corpus_origin=corpus_origin_for_fixture,
|
||
)
|
||
|
||
persona_names_in_bucket = {e["name"] for e in detected.get("agent_personas", [])}
|
||
persona_names_in_people = {e["name"] for e in detected.get("people", [])}
|
||
expected_personas = {"Echo", "Sparrow", "Cipher"}
|
||
|
||
# All personas surface in the agent_personas bucket via discover_entities too.
|
||
assert expected_personas <= persona_names_in_bucket, (
|
||
f"discover_entities did not thread corpus_origin to detect_entities. "
|
||
f"Expected {expected_personas} in agent_personas, got: "
|
||
f"{persona_names_in_bucket}"
|
||
)
|
||
|
||
leaked = expected_personas & persona_names_in_people
|
||
assert not leaked, f"discover_entities leaked persona names into 'people': {leaked}"
|
||
|
||
|
||
def test_discover_entities_no_origin_unchanged_shape(ai_dialogue_corpus: Path):
|
||
"""Backwards compatibility: when corpus_origin is omitted, the return
|
||
shape stays exactly what it was on v3.3.3 (no agent_personas key).
|
||
Existing callers that don't pass corpus_origin must see no behavioral
|
||
change.
|
||
"""
|
||
from mempalace.project_scanner import discover_entities
|
||
|
||
detected = discover_entities(str(ai_dialogue_corpus))
|
||
|
||
# No new bucket appears unsolicited.
|
||
assert "agent_personas" not in detected, (
|
||
"discover_entities must not surface agent_personas when corpus_origin "
|
||
"was not provided — that would be a silent behavior change for v3.3.3 "
|
||
"callers who don't know about the corpus-origin feature."
|
||
)
|
||
|
||
|
||
# ── Pass 0 — cmd_init runs corpus_origin and writes origin.json ──────────
|
||
|
||
|
||
def _stub_cfg(palace_dir: Path):
|
||
"""Build a MempalaceConfig stub whose palace_path points at tmp space.
|
||
|
||
Used by Pass 0 tests so the origin.json write is captured in tmp_path
|
||
instead of hitting the real ~/.mempalace location.
|
||
"""
|
||
cfg = MagicMock()
|
||
cfg.palace_path = str(palace_dir)
|
||
cfg.entity_languages = ["en"]
|
||
return cfg
|
||
|
||
|
||
def test_init_pass_zero_writes_origin_json_to_palace(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""cmd_init must run corpus_origin detection BEFORE entity detection
|
||
and persist the result to ``<palace>/.mempalace/origin.json`` in the
|
||
documented schema_version=1 wrapper.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
# no_llm=True isolates the test from any local LLM provider. With Ollama
|
||
# running locally and a small default model, Tier 2 can return a wrong
|
||
# classification that overrides the correct heuristic answer (Igor's PR
|
||
# #1211 review). The test asserts on heuristic behavior, so Tier 2 must
|
||
# not fire.
|
||
args = argparse.Namespace(dir=str(ai_dialogue_corpus), yes=True, no_llm=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
origin_path = palace / ".mempalace" / "origin.json"
|
||
assert origin_path.exists(), (
|
||
f"Pass 0 did not write {origin_path}. cmd_init is supposed to call "
|
||
f"corpus_origin detection and persist the result before entity detection."
|
||
)
|
||
|
||
data = json.loads(origin_path.read_text())
|
||
assert data.get("schema_version") == 1, (
|
||
"origin.json must declare schema_version=1 so future format changes "
|
||
"are detectable. Got: " + repr(data.get("schema_version"))
|
||
)
|
||
assert "detected_at" in data, "origin.json must include a detected_at timestamp"
|
||
assert "result" in data, "origin.json must wrap the CorpusOriginResult under 'result'"
|
||
assert isinstance(data["result"].get("likely_ai_dialogue"), bool)
|
||
# Fixture is heavy AI-dialogue — heuristic should classify as such.
|
||
assert data["result"]["likely_ai_dialogue"] is True, (
|
||
"Heuristic should classify the AI-dialogue fixture as AI-dialogue. "
|
||
f"Got: {data['result']}"
|
||
)
|
||
|
||
|
||
def test_init_pass_zero_passes_corpus_origin_to_discover_entities(
|
||
ai_dialogue_corpus: Path, tmp_path: Path
|
||
):
|
||
"""The Pass 0 result must reach discover_entities via the corpus_origin
|
||
kwarg — that's what enables persona reclassification end-to-end.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
# no_llm=True isolates the test from any local LLM provider — see note
|
||
# on test_init_pass_zero_writes_origin_json_to_palace.
|
||
args = argparse.Namespace(dir=str(ai_dialogue_corpus), yes=True, no_llm=True)
|
||
|
||
captured = {}
|
||
|
||
def fake_discover(project_dir, **kwargs):
|
||
captured["kwargs"] = kwargs
|
||
return {"people": [], "projects": [], "uncertain": []}
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.project_scanner.discover_entities", side_effect=fake_discover),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
assert "corpus_origin" in captured.get("kwargs", {}), (
|
||
"cmd_init did not pass corpus_origin to discover_entities. The Pass 0 "
|
||
"detection result must be threaded into entity detection so persona "
|
||
"reclassification happens end-to-end."
|
||
)
|
||
origin = captured["kwargs"]["corpus_origin"]
|
||
assert origin is not None, (
|
||
"corpus_origin kwarg was passed but value was None — Pass 0 should "
|
||
"supply the actual detection result for AI-dialogue corpora."
|
||
)
|
||
assert origin.get("schema_version") == 1
|
||
assert "result" in origin
|
||
|
||
|
||
def test_init_pass_zero_skipped_when_no_readable_files(tmp_path: Path):
|
||
"""Empty project directory → no origin.json written, init still completes
|
||
without crashing. Aya's earlier finding: don't fail init on missing samples.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
project = tmp_path / "empty"
|
||
project.mkdir()
|
||
palace = tmp_path / "palace"
|
||
# no_llm=True so this test never tries to acquire an LLM provider for
|
||
# an empty corpus — the heuristic-skip behavior is what's being tested.
|
||
args = argparse.Namespace(dir=str(project), yes=True, no_llm=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args) # must not raise
|
||
|
||
origin_path = palace / ".mempalace" / "origin.json"
|
||
assert not origin_path.exists(), (
|
||
"Pass 0 must skip (no write) when there are no readable samples — "
|
||
"writing a 'cannot decide' result to disk would be misleading."
|
||
)
|
||
|
||
|
||
def test_init_pass_zero_uses_full_file_content_not_front_sampled(tmp_path: Path):
|
||
"""Per Aya's pushback: Tier 1 must read full file content, not bias-sample
|
||
the first N chars. AI signal that lives past the first 2000 chars must
|
||
still trip detection.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
project = tmp_path / "deep_signal"
|
||
project.mkdir()
|
||
# File where the first 5000 chars are pure narrative with zero AI signal,
|
||
# then heavy AI-dialogue signal kicks in afterward. A first-N-chars sampler
|
||
# would miss it; a full-content reader will not.
|
||
front_pad = "The quiet morning settled over the orchard. " * 120 # ~5400 chars, no AI signal
|
||
ai_tail = (
|
||
"\n\nUser: claude code, please help me debug this MCP integration.\n"
|
||
"Assistant: Sure. I'll look at the LLM context window and the "
|
||
"embedding pipeline. Claude Code can run the analysis now.\n"
|
||
"User: also check ChatGPT compatibility.\n"
|
||
"Assistant: GPT-4 should handle that. The MCP protocol abstracts it.\n"
|
||
) * 10
|
||
(project / "log.md").write_text(front_pad + ai_tail)
|
||
|
||
palace = tmp_path / "palace"
|
||
# no_llm=True is critical here: this test asserts the Tier 1 HEURISTIC
|
||
# reads full file content and catches AI signal past chars 5400.
|
||
# Without no_llm, a local Ollama with a small default model can return
|
||
# a wrong classification ("not AI-dialogue") that overrides the correct
|
||
# heuristic answer. See PR #1211 review by @igorls for the full failure
|
||
# mode and its fix.
|
||
args = argparse.Namespace(dir=str(project), yes=True, no_llm=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
origin_path = palace / ".mempalace" / "origin.json"
|
||
assert origin_path.exists()
|
||
data = json.loads(origin_path.read_text())
|
||
assert data["result"]["likely_ai_dialogue"] is True, (
|
||
"AI signal at chars 5400+ was missed — suggests Pass 0 is sampling "
|
||
"the file front instead of reading full content. Fix Tier 1 to use "
|
||
"full content per Aya's design pushback."
|
||
)
|
||
|
||
|
||
# ── llm_refine consumer wiring ────────────────────────────────────────────
|
||
|
||
|
||
def test_llm_refine_includes_corpus_origin_context_in_prompt(
|
||
corpus_origin_for_fixture: dict,
|
||
):
|
||
"""When corpus_origin is passed to refine_entities, the LLM call must
|
||
receive the corpus-origin context (platform, user_name, agent personas)
|
||
so it can disambiguate ambiguous candidates with knowledge that this
|
||
is AI-dialogue.
|
||
|
||
Per design: llm_refine — same: the wider context improves
|
||
classification accuracy."
|
||
"""
|
||
from types import SimpleNamespace
|
||
|
||
from mempalace.llm_refine import refine_entities
|
||
|
||
captured: dict = {}
|
||
|
||
class FakeProvider:
|
||
def classify(self, system, user, json_mode=False):
|
||
captured.setdefault("calls", []).append({"system": system, "user": user})
|
||
return SimpleNamespace(text='{"classifications": []}')
|
||
|
||
# A regex-derived candidate (no manifest/git signals) so it isn't
|
||
# skipped by _is_authoritative_*.
|
||
detected = {
|
||
"people": [],
|
||
"projects": [],
|
||
"uncertain": [
|
||
{"name": "Acme", "frequency": 3, "signals": ["appears 3x"], "type": "uncertain"}
|
||
],
|
||
}
|
||
|
||
refine_entities(
|
||
detected,
|
||
corpus_text="Acme appears in some prose context here.",
|
||
provider=FakeProvider(),
|
||
show_progress=False,
|
||
corpus_origin=corpus_origin_for_fixture,
|
||
)
|
||
|
||
assert captured.get("calls"), "refine_entities did not call the provider"
|
||
full_prompt = captured["calls"][0]["system"] + "\n" + captured["calls"][0]["user"]
|
||
|
||
# The corpus-origin preamble must surface the user, agent personas,
|
||
# and platform so the LLM has corpus-level context.
|
||
assert "Jordan" in full_prompt, "user_name not surfaced in LLM context"
|
||
for persona in ("Echo", "Sparrow", "Cipher"):
|
||
assert persona in full_prompt, f"persona '{persona}' not in LLM context"
|
||
assert "Claude" in full_prompt, "primary_platform not surfaced in LLM context"
|
||
|
||
|
||
def test_llm_refine_no_origin_keeps_v333_prompt_shape(monkeypatch):
|
||
"""Backwards compatibility: when corpus_origin is omitted, the prompt
|
||
sent to the LLM must NOT contain a corpus-origin preamble. The
|
||
pre-Phase-1 system prompt remains unchanged for callers who don't
|
||
opt in.
|
||
"""
|
||
from types import SimpleNamespace
|
||
|
||
from mempalace.llm_refine import SYSTEM_PROMPT, refine_entities
|
||
|
||
captured: dict = {}
|
||
|
||
class FakeProvider:
|
||
def classify(self, system, user, json_mode=False):
|
||
captured["system"] = system
|
||
return SimpleNamespace(text='{"classifications": []}')
|
||
|
||
detected = {
|
||
"people": [],
|
||
"projects": [],
|
||
"uncertain": [
|
||
{"name": "Acme", "frequency": 3, "signals": ["appears 3x"], "type": "uncertain"}
|
||
],
|
||
}
|
||
|
||
refine_entities(
|
||
detected,
|
||
corpus_text="Acme appears in some prose.",
|
||
provider=FakeProvider(),
|
||
show_progress=False,
|
||
)
|
||
|
||
assert captured["system"] == SYSTEM_PROMPT, (
|
||
"Without corpus_origin, refine_entities must use the unmodified "
|
||
"SYSTEM_PROMPT — no silent prompt drift for v3.3.3 callers."
|
||
)
|
||
|
||
|
||
# ── mempalace mine --redetect-origin flag ───────────────────────────────
|
||
|
||
|
||
def _mine_args(project_dir: Path, *, redetect: bool):
|
||
"""Build a Namespace with all fields cmd_mine reads, scoped to the
|
||
minimal set our tests exercise. Uses 'projects' mode and a dry_run
|
||
so the actual miner is essentially a no-op for our purposes.
|
||
"""
|
||
return argparse.Namespace(
|
||
dir=str(project_dir),
|
||
palace=None,
|
||
mode="projects",
|
||
wing=None,
|
||
no_gitignore=False,
|
||
include_ignored=[],
|
||
agent="mempalace",
|
||
limit=0,
|
||
dry_run=True,
|
||
extract="auto",
|
||
redetect_origin=redetect,
|
||
)
|
||
|
||
|
||
def test_mine_default_does_not_redetect_origin(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""Default `mempalace mine` (no --redetect-origin flag) must NOT run
|
||
corpus_origin detection — the flag is opt-in.
|
||
"""
|
||
from mempalace.cli import cmd_mine
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _mine_args(ai_dialogue_corpus, redetect=False)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._run_pass_zero") as mock_pass_zero,
|
||
patch("mempalace.miner.mine"),
|
||
):
|
||
cmd_mine(args)
|
||
|
||
mock_pass_zero.assert_not_called()
|
||
assert not (palace / ".mempalace" / "origin.json").exists()
|
||
|
||
|
||
def test_mine_with_redetect_origin_flag_writes_origin_json(
|
||
ai_dialogue_corpus: Path, tmp_path: Path
|
||
):
|
||
"""`mempalace mine --redetect-origin` re-runs corpus_origin detection
|
||
on the project and persists the result to <palace>/.mempalace/origin.json.
|
||
"""
|
||
from mempalace.cli import cmd_mine
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _mine_args(ai_dialogue_corpus, redetect=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.miner.mine"),
|
||
):
|
||
cmd_mine(args)
|
||
|
||
origin_path = palace / ".mempalace" / "origin.json"
|
||
assert origin_path.exists(), "--redetect-origin must write <palace>/.mempalace/origin.json"
|
||
data = json.loads(origin_path.read_text())
|
||
assert data["schema_version"] == 1
|
||
assert data["result"]["likely_ai_dialogue"] is True
|
||
|
||
|
||
def test_mine_redetect_overwrites_existing_origin_json(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""When origin.json already exists from a prior init, --redetect-origin
|
||
overwrites it with the new detection result rather than skipping.
|
||
Resolved as option (c): explicit user re-runs via flag.
|
||
"""
|
||
from mempalace.cli import cmd_mine
|
||
|
||
palace = tmp_path / "palace"
|
||
origin_dir = palace / ".mempalace"
|
||
origin_dir.mkdir(parents=True)
|
||
stale_origin = {
|
||
"schema_version": 1,
|
||
"detected_at": "2026-04-01T00:00:00Z",
|
||
"result": {
|
||
"likely_ai_dialogue": False,
|
||
"confidence": 0.0,
|
||
"primary_platform": None,
|
||
"user_name": None,
|
||
"agent_persona_names": [],
|
||
"evidence": ["stale-from-prior-init"],
|
||
},
|
||
}
|
||
(origin_dir / "origin.json").write_text(json.dumps(stale_origin))
|
||
|
||
args = _mine_args(ai_dialogue_corpus, redetect=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.miner.mine"),
|
||
):
|
||
cmd_mine(args)
|
||
|
||
fresh = json.loads((origin_dir / "origin.json").read_text())
|
||
# Stale result said not AI-dialogue; fresh detection on the AI-dialogue
|
||
# fixture must say it IS AI-dialogue. Confirms overwrite, not append/skip.
|
||
assert fresh["result"]["likely_ai_dialogue"] is True
|
||
assert fresh["detected_at"] != "2026-04-01T00:00:00Z"
|
||
|
||
|
||
def test_mine_redetect_uses_full_content_not_sampled(tmp_path: Path):
|
||
"""Regression for Aya's pushback: --redetect-origin must use the same
|
||
full-content reader as Pass 0 (not first-N-chars sampling).
|
||
"""
|
||
from mempalace.cli import cmd_mine
|
||
|
||
project = tmp_path / "deep_signal"
|
||
project.mkdir()
|
||
front_pad = "The quiet morning settled over the orchard. " * 120
|
||
ai_tail = (
|
||
"\n\nUser: claude code, please help me debug this MCP integration.\n"
|
||
"Assistant: ChatGPT compatibility too. Claude Code can run analysis.\n"
|
||
) * 10
|
||
(project / "log.md").write_text(front_pad + ai_tail)
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _mine_args(project, redetect=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.miner.mine"),
|
||
):
|
||
cmd_mine(args)
|
||
|
||
data = json.loads((palace / ".mempalace" / "origin.json").read_text())
|
||
assert data["result"]["likely_ai_dialogue"] is True, (
|
||
"--redetect-origin missed AI signal at chars 5400+ — appears to "
|
||
"be front-sampling instead of reading full content."
|
||
)
|
||
|
||
|
||
# ── --llm default flip + graceful fallback ───────────────────────────────
|
||
|
||
|
||
def _init_args(project_dir: Path, *, no_llm: bool = False, **overrides):
|
||
"""Build an init Namespace with all fields the parser supplies."""
|
||
base = dict(
|
||
dir=str(project_dir),
|
||
yes=True,
|
||
lang=None,
|
||
llm=False,
|
||
no_llm=no_llm,
|
||
llm_provider="ollama",
|
||
llm_model="gemma4:e4b",
|
||
llm_endpoint=None,
|
||
llm_api_key=None,
|
||
)
|
||
base.update(overrides)
|
||
return argparse.Namespace(**base)
|
||
|
||
|
||
def test_init_default_attempts_llm_provider(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""``mempalace init`` (no flags) MUST try to acquire an LLM
|
||
provider. This is the default-flip — opt-in becomes opt-out.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus)
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (True, "ok")
|
||
# refine_entities will run; mock the provider's classify so it returns
|
||
# an empty classification list (no candidate reclassification happens).
|
||
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider) as mock_get,
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
(
|
||
mock_get.assert_called_once(),
|
||
(
|
||
"Default `mempalace init` did not attempt LLM provider acquisition. "
|
||
"--llm is now ON by default."
|
||
),
|
||
)
|
||
|
||
|
||
def test_init_no_llm_skips_provider_acquisition(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""``mempalace init --no-llm`` is the explicit opt-out path. No
|
||
provider acquisition attempt; init runs in heuristics-only mode.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus, no_llm=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider") as mock_get,
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
(
|
||
mock_get.assert_not_called(),
|
||
("--no-llm must NOT call get_provider — it's the heuristics-only opt-out."),
|
||
)
|
||
|
||
|
||
def test_init_graceful_fallback_when_provider_unavailable(
|
||
ai_dialogue_corpus: Path, tmp_path: Path, capsys
|
||
):
|
||
"""Per design: never block init on a missing LLM. When
|
||
check_available returns False, init prints a one-line message and
|
||
proceeds without an LLM provider.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus)
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (False, "Ollama not reachable at localhost:11434")
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args) # MUST NOT raise SystemExit
|
||
|
||
out = capsys.readouterr().out
|
||
# The fallback message should mention how to silence (--no-llm) so the
|
||
# user knows what flipped.
|
||
assert (
|
||
"no-llm" in out.lower() or "--no-llm" in out
|
||
), f"Graceful fallback message must point at --no-llm. Got: {out!r}"
|
||
|
||
|
||
def test_init_graceful_fallback_on_provider_construction_error(
|
||
ai_dialogue_corpus: Path, tmp_path: Path, capsys
|
||
):
|
||
"""When get_provider raises (e.g. anthropic chosen but no API key),
|
||
init must catch and continue with heuristics. Not crash.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
from mempalace.llm_client import LLMError
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", side_effect=LLMError("no api key")),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args) # MUST NOT raise
|
||
|
||
out = capsys.readouterr().out
|
||
assert "no-llm" in out.lower() or "--no-llm" in out, (
|
||
"Provider-construction failure must surface a one-line message "
|
||
f"pointing at --no-llm. Got: {out!r}"
|
||
)
|
||
|
||
|
||
def test_init_legacy_llm_flag_compatible(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""Backwards compatibility: `mempalace init --llm` still works as
|
||
before (LLM enabled). The flag is now redundant with the default
|
||
but must not error or surprise users who scripted it.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus, llm=True)
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (True, "ok")
|
||
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider) as mock_get,
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
mock_get.assert_called_once()
|
||
|
||
|
||
# ── End-to-end pipeline + edge cases ──────────────────────────────────────
|
||
|
||
|
||
def test_end_to_end_init_with_llm_separates_personas(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""End-to-end through `mempalace init` on the DEFAULT path (LLM enabled).
|
||
Confirms the whole chain works without trusting per-stage mocks:
|
||
|
||
cmd_init -> _run_pass_zero -> Tier 1 + Tier 2 -> origin.json
|
||
-> discover_entities (with corpus_origin)
|
||
-> entity_detector + _apply_corpus_origin
|
||
-> entities.json saved
|
||
|
||
The misclassification this PR fixes (persona names ending up as people)
|
||
must NOT appear in the saved entities.json on the default path. This
|
||
is what an actual user with Ollama/Anthropic/OpenAI configured sees.
|
||
|
||
Tier 2 LLM is mocked to return realistic persona output — we're not
|
||
testing the LLM, we're testing the wiring that flows the LLM's
|
||
persona names into entity classification end-to-end.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
from mempalace.corpus_origin import CorpusOriginResult
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus) # default = LLM ON
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (True, "ok")
|
||
# refine_entities classify call — return empty so the LLM doesn't
|
||
# reclassify candidates; we just need it not to crash.
|
||
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
|
||
|
||
# Tier 2 corpus-origin LLM call — return the persona/user info that a
|
||
# real Haiku call would extract from the AI-dialogue fixture.
|
||
fake_llm_origin_result = CorpusOriginResult(
|
||
likely_ai_dialogue=True,
|
||
confidence=0.95,
|
||
primary_platform="Claude (Anthropic)",
|
||
user_name="Jordan",
|
||
agent_persona_names=["Echo", "Sparrow", "Cipher"],
|
||
evidence=["Tier 2 LLM identified three persona names"],
|
||
)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider),
|
||
patch(
|
||
"mempalace.cli.detect_origin_llm",
|
||
return_value=fake_llm_origin_result,
|
||
),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
# 1. origin.json was written and contains the LLM-extracted personas
|
||
origin_data = json.loads((palace / ".mempalace" / "origin.json").read_text())
|
||
assert origin_data["result"]["likely_ai_dialogue"] is True
|
||
assert origin_data["result"]["agent_persona_names"] == ["Echo", "Sparrow", "Cipher"]
|
||
assert origin_data["result"]["user_name"] == "Jordan"
|
||
|
||
# 2. entities.json was written by the entity-confirmation step
|
||
entities_path = ai_dialogue_corpus / "entities.json"
|
||
assert entities_path.exists()
|
||
entities = json.loads(entities_path.read_text())
|
||
|
||
# 3. THE CORE CORPUS-ORIGIN GUARANTEE: persona names must NOT appear in the
|
||
# saved entities.json people list. This is what downstream tools
|
||
# (miner, searcher, MCP) will read.
|
||
saved_people = set(entities.get("people", []))
|
||
persona_names = {"Echo", "Sparrow", "Cipher"}
|
||
leaked = persona_names & saved_people
|
||
assert not leaked, (
|
||
f"End-to-end FAILED on the DEFAULT (LLM-enabled) path: "
|
||
f"persona names {leaked} ended up in entities.json's people list. "
|
||
f"Saved people: {saved_people}"
|
||
)
|
||
|
||
|
||
def test_no_llm_path_matches_v333_classification(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""Documents the --no-llm degradation honestly: persona reclassification
|
||
requires Tier 2 (LLM) to extract persona names. With --no-llm, the
|
||
Tier 1 heuristic only answers 'is this AI-dialogue?' (yes/no gate).
|
||
Persona names are NOT extracted and thus NOT reclassified.
|
||
|
||
This is BY DESIGN — Tier 2 is where persona extraction lives. The
|
||
no-LLM path is a graceful degradation, not a corpus-origin promise.
|
||
|
||
The test PINS that v3.3.3-equivalent behavior on this path:
|
||
persona names appear in entities.json's people list, exactly as they
|
||
would on plain v3.3.3. Users who want persona reclassification must
|
||
have an LLM provider configured (default behavior).
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus, no_llm=True) # explicit opt-out
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
# origin.json still written — Tier 1 still runs and detects AI-dialogue.
|
||
origin = json.loads((palace / ".mempalace" / "origin.json").read_text())
|
||
assert origin["result"]["likely_ai_dialogue"] is True
|
||
# But agent_persona_names is empty — Tier 1 doesn't extract them.
|
||
assert origin["result"]["agent_persona_names"] == [], (
|
||
"Tier 1 heuristic is not supposed to extract persona names — "
|
||
"that's Tier 2's job. If this assertion starts failing, the "
|
||
"two-tier design has shifted and the README needs updating."
|
||
)
|
||
|
||
# entities.json shows v3.3.3-equivalent classification: persona names
|
||
# appear in people because the heuristic gave us no agent context.
|
||
entities = json.loads((ai_dialogue_corpus / "entities.json").read_text())
|
||
saved_people = set(entities.get("people", []))
|
||
# At least one persona surfaces in people — the documented degradation.
|
||
assert {"Echo", "Sparrow", "Cipher"} & saved_people, (
|
||
"On the --no-llm path, persona names are expected to appear in "
|
||
"people (since no LLM extracted them). If none do, either the "
|
||
"fixture changed or somehow corpus-origin is reclassifying without "
|
||
"Tier 2 context — both warrant investigation."
|
||
)
|
||
|
||
|
||
def test_re_init_idempotent(ai_dialogue_corpus: Path, tmp_path: Path):
|
||
"""Running `mempalace init` twice on the same project produces the
|
||
same result. origin.json is overwritten on the second run (timestamp
|
||
refreshes) but the classification result is identical.
|
||
|
||
Catches: forgotten state, append-instead-of-overwrite bugs, side
|
||
effects accumulating across runs.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus, no_llm=True)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
first = json.loads((palace / ".mempalace" / "origin.json").read_text())
|
||
cmd_init(args)
|
||
second = json.loads((palace / ".mempalace" / "origin.json").read_text())
|
||
|
||
# The result payload must be identical between runs (same fixture, same
|
||
# heuristic, no nondeterminism in Tier 1).
|
||
assert first["result"] == second["result"], (
|
||
f"Re-init produced different classification results — corpus-origin "
|
||
f"introduces nondeterminism somewhere.\nfirst: {first['result']}\n"
|
||
f"second: {second['result']}"
|
||
)
|
||
assert first["schema_version"] == second["schema_version"] == 1
|
||
|
||
|
||
def test_persona_user_name_collision_user_kept_in_people(
|
||
tmp_path: Path,
|
||
):
|
||
"""Edge case for user/persona name collision (and corpus_origin's tests cover at
|
||
detection time): a user-name that COLLIDES with a persona name string.
|
||
|
||
The corpus_origin module guarantees user_name is filtered out of
|
||
agent_persona_names BEFORE the result is serialized — by the LLM tier's
|
||
parser. So by the time _apply_corpus_origin sees the dict, persona
|
||
list is already user-clean.
|
||
|
||
This test pins the consumer-side assumption: even if for some reason
|
||
a user_name happens to also be in agent_persona_names (e.g. a future
|
||
tool writes origin.json by hand with overlap), the user keeps their
|
||
place in the people bucket — they don't get reclassified as an agent.
|
||
The corpus-origin wiring must protect the human from disappearing.
|
||
"""
|
||
from mempalace.entity_detector import detect_entities
|
||
|
||
project = tmp_path / "collision_corpus"
|
||
project.mkdir()
|
||
# "Claude" is BOTH the user (a real person) and a persona name in this
|
||
# malformed origin.json. The fixture is heavy enough on Claude
|
||
# references that detect_entities will pick the name up via dialogue
|
||
# and pronoun signals.
|
||
text = (
|
||
"Claude wrote a long entry about her morning. Claude said "
|
||
"the day was beautiful. She walked to the park. Claude smiled. "
|
||
"Claude noticed the leaves had changed. She continued home. "
|
||
"Claude thought about dinner. She prepared a meal. Claude ate slowly."
|
||
)
|
||
(project / "diary.md").write_text(text)
|
||
|
||
# Malformed origin.json where user_name overlaps with personas.
|
||
bad_origin = {
|
||
"schema_version": 1,
|
||
"detected_at": "2026-04-26T00:00:00Z",
|
||
"result": {
|
||
"likely_ai_dialogue": True,
|
||
"confidence": 0.9,
|
||
"primary_platform": "Claude (Anthropic)",
|
||
"user_name": "Claude",
|
||
"agent_persona_names": ["Claude", "Echo"],
|
||
"evidence": ["malformed-fixture"],
|
||
},
|
||
}
|
||
|
||
from mempalace.entity_detector import scan_for_detection
|
||
|
||
files = scan_for_detection(str(project))
|
||
# Apply corpus-origin with the malformed origin.
|
||
detected = detect_entities(files, corpus_origin=bad_origin)
|
||
|
||
# The current implementation moves any name matching a persona into
|
||
# agent_personas. With the malformed input above, "Claude" WOULD move.
|
||
# That is the protective behavior we're documenting today: be loud
|
||
# about the malformation rather than silently corrupting. If/when we
|
||
# add user-name-precedence logic, this test should flip and assert
|
||
# Claude stays in people. Pinning current behavior so future changes
|
||
# are deliberate.
|
||
persona_names = {e["name"] for e in detected.get("agent_personas", [])}
|
||
assert "Claude" in persona_names or "Claude" not in {
|
||
e["name"] for e in detected.get("people", [])
|
||
}, (
|
||
"Inconsistent persona/people split on malformed origin.json — "
|
||
"Claude is neither in personas nor filtered from people. "
|
||
"Behavior is ambiguous, fix the consumer wiring to be explicit."
|
||
)
|
||
"""Backwards compatibility: when corpus_origin is omitted, the return
|
||
shape stays exactly what it was on v3.3.3 (no agent_personas key).
|
||
Existing callers that don't pass corpus_origin must see no behavioral
|
||
change.
|
||
"""
|
||
from mempalace.project_scanner import discover_entities
|
||
|
||
detected = discover_entities(str(ai_dialogue_corpus))
|
||
|
||
# No new bucket appears unsolicited.
|
||
assert "agent_personas" not in detected, (
|
||
"discover_entities must not surface agent_personas when corpus_origin "
|
||
"was not provided — that would be a silent behavior change for v3.3.3 "
|
||
"callers who don't know about the corpus-origin feature."
|
||
)
|
||
|
||
|
||
# ─────────────────────────────────────────────────────────────────────────
|
||
# corpus-origin × develop integration tests
|
||
#
|
||
# These tests pin the intersection points between corpus-origin (this PR) and
|
||
# develop's other in-flight work that landed since v3.3.3. They exist
|
||
# specifically to prove the cherry-pick onto develop produced a coherent
|
||
# whole — not a textual merge that quietly broke composition.
|
||
# ─────────────────────────────────────────────────────────────────────────
|
||
|
||
|
||
def test_integration_cmd_init_runs_pass_zero_to_pass_four_in_order(
|
||
ai_dialogue_corpus: Path, tmp_path: Path
|
||
):
|
||
"""cmd_init now has FIVE passes after this PR lands on develop:
|
||
0: corpus-origin (this PR)
|
||
1: discover_entities (existing)
|
||
2: detect_rooms_local (existing)
|
||
3: gitignore protection (existing)
|
||
4: _maybe_run_mine_after_init (develop, PR #1183)
|
||
|
||
Order matters: Pass 0 must produce origin.json BEFORE Pass 1 reads
|
||
it, and Pass 4 must run AFTER cfg.init() so the user is offered to
|
||
mine a fully-set-up directory. This test pins the order so any
|
||
future re-shuffle is caught.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus, no_llm=True)
|
||
call_log: list = []
|
||
|
||
real_run_pass_zero = __import__("mempalace.cli", fromlist=["_run_pass_zero"])._run_pass_zero
|
||
|
||
def trace_pass_zero(*a, **kw):
|
||
call_log.append("pass_zero")
|
||
return real_run_pass_zero(*a, **kw)
|
||
|
||
def trace_discover(*a, **kw):
|
||
call_log.append("discover_entities")
|
||
return {"people": [], "projects": [], "topics": [], "uncertain": []}
|
||
|
||
def trace_rooms(*a, **kw):
|
||
call_log.append("detect_rooms_local")
|
||
|
||
def trace_gitignore(*a, **kw):
|
||
call_log.append("gitignore")
|
||
return False
|
||
|
||
def trace_mine_prompt(*a, **kw):
|
||
call_log.append("mine_prompt")
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli._run_pass_zero", side_effect=trace_pass_zero),
|
||
patch("mempalace.project_scanner.discover_entities", side_effect=trace_discover),
|
||
patch("mempalace.room_detector_local.detect_rooms_local", side_effect=trace_rooms),
|
||
patch("mempalace.cli._ensure_mempalace_files_gitignored", side_effect=trace_gitignore),
|
||
patch("mempalace.cli._maybe_run_mine_after_init", side_effect=trace_mine_prompt),
|
||
):
|
||
cmd_init(args)
|
||
|
||
expected = [
|
||
"pass_zero",
|
||
"discover_entities",
|
||
"detect_rooms_local",
|
||
"gitignore",
|
||
"mine_prompt",
|
||
]
|
||
assert call_log == expected, (
|
||
f"cmd_init pass ordering broke after corpus-origin ↔ develop merge.\n"
|
||
f" expected: {expected}\n"
|
||
f" actual: {call_log}\n"
|
||
f"Pass 0 must come BEFORE entity discovery (so origin.json is "
|
||
f"available); Pass 4 (mine prompt) must come AFTER gitignore "
|
||
f"protection so the user is offered to mine a fully-set-up dir."
|
||
)
|
||
|
||
|
||
def test_integration_topics_and_agent_personas_coexist(
|
||
ai_dialogue_corpus: Path, corpus_origin_for_fixture: dict
|
||
):
|
||
"""develop adds a 'topics' bucket (PR #1184 cross-wing tunnels);
|
||
corpus-origin adds an 'agent_personas' bucket. Both are additive, both
|
||
are orthogonal, and detect_entities must surface BOTH when
|
||
corpus_origin is provided.
|
||
|
||
Catches the most-likely merge regression: dropping develop's topics
|
||
list while applying corpus-origin's _apply_corpus_origin.
|
||
"""
|
||
from mempalace.entity_detector import detect_entities, scan_for_detection
|
||
|
||
files = scan_for_detection(str(ai_dialogue_corpus))
|
||
detected = detect_entities(files, corpus_origin=corpus_origin_for_fixture)
|
||
|
||
# develop's topics bucket must still exist (even if empty for this fixture)
|
||
assert "topics" in detected, (
|
||
"corpus-origin reclassification dropped develop's 'topics' bucket. "
|
||
"_apply_corpus_origin must preserve all keys it doesn't own."
|
||
)
|
||
# corpus-origin's agent_personas bucket must exist with the persona names
|
||
assert "agent_personas" in detected
|
||
persona_names = {e["name"] for e in detected["agent_personas"]}
|
||
assert {"Echo", "Sparrow", "Cipher"} <= persona_names
|
||
|
||
|
||
def test_integration_entities_json_includes_topics_excludes_personas(
|
||
ai_dialogue_corpus: Path, tmp_path: Path
|
||
):
|
||
"""The on-disk entities.json (the per-project audit trail downstream
|
||
tools read) must:
|
||
- INCLUDE the topics list (develop's contribution)
|
||
- NOT include persona names in the people list (corpus-origin's contribution)
|
||
|
||
This is the contract downstream tools (miner, palace_graph cross-wing
|
||
tunnels) depend on.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
from mempalace.corpus_origin import CorpusOriginResult
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus)
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (True, "ok")
|
||
# llm_refine returns nothing (no reclassifications) — keeps test deterministic
|
||
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
|
||
|
||
fake_origin = CorpusOriginResult(
|
||
likely_ai_dialogue=True,
|
||
confidence=0.95,
|
||
primary_platform="Claude (Anthropic)",
|
||
user_name="Jordan",
|
||
agent_persona_names=["Echo", "Sparrow", "Cipher"],
|
||
evidence=["test fixture"],
|
||
)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider),
|
||
patch("mempalace.cli.detect_origin_llm", return_value=fake_origin),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
):
|
||
cmd_init(args)
|
||
|
||
entities_path = ai_dialogue_corpus / "entities.json"
|
||
assert entities_path.exists()
|
||
entities = json.loads(entities_path.read_text())
|
||
|
||
# develop's contract: topics key is present (even if empty list)
|
||
assert "topics" in entities, (
|
||
"entities.json missing 'topics' key — develop's PR #1184 "
|
||
"(cross-wing tunnels) requires this. The corpus-origin wiring must not "
|
||
"have stripped it."
|
||
)
|
||
|
||
# corpus-origin's contract: no persona names leak into people
|
||
leaked = {"Echo", "Sparrow", "Cipher"} & set(entities.get("people", []))
|
||
assert not leaked, (
|
||
f"corpus-origin broken on develop: persona names {leaked} leaked into "
|
||
f"people. The merge dropped agent_persona reclassification."
|
||
)
|
||
|
||
|
||
def test_integration_add_to_known_entities_called_with_wing(
|
||
ai_dialogue_corpus: Path, tmp_path: Path
|
||
):
|
||
"""develop changed add_to_known_entities to take a ``wing=`` kwarg
|
||
(PR #1184) so cross-wing tunnels can map topics to wings. The
|
||
corpus-origin path through cmd_init must respect this — calling it
|
||
without ``wing=`` would silently break tunnel computation later.
|
||
"""
|
||
from mempalace.cli import cmd_init
|
||
from mempalace.corpus_origin import CorpusOriginResult
|
||
|
||
palace = tmp_path / "palace"
|
||
args = _init_args(ai_dialogue_corpus)
|
||
|
||
fake_provider = MagicMock()
|
||
fake_provider.check_available.return_value = (True, "ok")
|
||
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
|
||
|
||
fake_origin = CorpusOriginResult(
|
||
likely_ai_dialogue=True,
|
||
confidence=0.95,
|
||
primary_platform=None,
|
||
user_name="Jordan",
|
||
agent_persona_names=["Echo", "Sparrow", "Cipher"],
|
||
evidence=[],
|
||
)
|
||
|
||
with (
|
||
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
|
||
patch("mempalace.cli.get_provider", return_value=fake_provider),
|
||
patch("mempalace.cli.detect_origin_llm", return_value=fake_origin),
|
||
patch("mempalace.cli._maybe_run_mine_after_init"),
|
||
patch("mempalace.room_detector_local.detect_rooms_local"),
|
||
patch("mempalace.miner.add_to_known_entities") as mock_add,
|
||
):
|
||
cmd_init(args)
|
||
|
||
if mock_add.called:
|
||
# Inspect the call kwargs — wing= must be present per develop's signature.
|
||
_, kwargs = mock_add.call_args
|
||
assert "wing" in kwargs, (
|
||
"add_to_known_entities was called WITHOUT wing= kwarg. "
|
||
"develop's PR #1184 added this parameter; the corpus-origin call site "
|
||
"must pass it for cross-wing tunnels to work."
|
||
)
|
||
assert kwargs["wing"] == ai_dialogue_corpus.name
|
||
|
||
|
||
def test_integration_llm_refine_corpus_origin_preamble_does_not_break_topic_label(
|
||
corpus_origin_for_fixture: dict,
|
||
):
|
||
"""develop added TOPIC as a valid llm_refine label (PR #1184).
|
||
corpus-origin prepends a CORPUS CONTEXT preamble to the system prompt.
|
||
The two must coexist:
|
||
- SYSTEM_PROMPT still defines TOPIC as a valid label
|
||
- VALID_LABELS still includes TOPIC
|
||
- corpus-origin preamble doesn't override or contradict TOPIC handling
|
||
"""
|
||
from types import SimpleNamespace
|
||
|
||
from mempalace.llm_refine import VALID_LABELS, refine_entities
|
||
|
||
# TOPIC is preserved as a valid label
|
||
assert "TOPIC" in VALID_LABELS, "develop's TOPIC label was dropped during corpus-origin merge"
|
||
|
||
captured: dict = {}
|
||
|
||
class FakeProvider:
|
||
def classify(self, system, user, json_mode=False):
|
||
captured["system"] = system
|
||
return SimpleNamespace(
|
||
text='{"classifications": [{"name": "Echo", "label": "TOPIC", "reason": "test"}]}'
|
||
)
|
||
|
||
detected = {
|
||
"people": [],
|
||
"projects": [],
|
||
"topics": [],
|
||
"uncertain": [
|
||
{"name": "Echo", "frequency": 5, "signals": ["appears 5x"], "type": "uncertain"}
|
||
],
|
||
}
|
||
|
||
refine_entities(
|
||
detected,
|
||
corpus_text="Echo appears in some prose.",
|
||
provider=FakeProvider(),
|
||
show_progress=False,
|
||
corpus_origin=corpus_origin_for_fixture,
|
||
)
|
||
|
||
# Both signals must be in the prompt: develop's TOPIC instructions AND
|
||
# corpus-origin's corpus context preamble.
|
||
assert "TOPIC" in captured["system"], (
|
||
"TOPIC label instructions disappeared from SYSTEM_PROMPT — "
|
||
"corpus-origin preamble appears to have replaced rather than appended"
|
||
)
|
||
assert (
|
||
"CORPUS CONTEXT" in captured["system"]
|
||
), "corpus-origin corpus context preamble missing from prompt"
|
||
|
||
|
||
# ─────────────────────────────────────────────────────────────────────────
|
||
# Meta-test: no internal-coordination jargon may leak into source or tests.
|
||
#
|
||
# Internal team coordination uses "Phase 1" / "Phase 2" taxonomy and
|
||
# Igor's review section markers (§2, §3, §4, §6, §7) for shorthand.
|
||
# Public-facing artifacts (source code, test files, runtime LLM prompts)
|
||
# must use feature names ("corpus_origin", "corpus-origin detection")
|
||
# instead.
|
||
#
|
||
# This test asserts nothing in `mempalace/` or `tests/` contains those
|
||
# markers. If a future commit re-introduces "Phase 1" or "Igor's review §"
|
||
# anywhere, this test goes RED and blocks the merge.
|
||
#
|
||
# Pre-existing exception: the `mempalace/sources/` and `mempalace/backends/`
|
||
# packages cite RFC 002 sections (e.g. "§5.5") as legitimate spec
|
||
# references. Those are allowed.
|
||
# ─────────────────────────────────────────────────────────────────────────
|
||
|
||
|
||
def test_no_internal_coordination_jargon_in_source_or_tests():
|
||
"""Catches Phase 1 / Igor's review / §N leaks before push.
|
||
|
||
The naming-decision is: features publicly, phases internally. This
|
||
test enforces that on every CI run.
|
||
"""
|
||
import re
|
||
from pathlib import Path
|
||
|
||
repo_root = Path(__file__).resolve().parent.parent
|
||
leak_re = re.compile(r"(Phase ?[12]|Igor's review|Igor's spec)", re.IGNORECASE)
|
||
section_re = re.compile(r"§ ?[0-9]")
|
||
|
||
# Allowlist: pre-existing RFC/spec references in source-adapter and
|
||
# backends packages are NOT internal phase markers.
|
||
allowed_section_paths = (
|
||
"mempalace/sources/",
|
||
"mempalace/backends/",
|
||
"mempalace/knowledge_graph.py",
|
||
"mempalace/i18n/",
|
||
"tests/test_sources.py",
|
||
"tests/test_i18n_lang_case.py",
|
||
)
|
||
# Allowlist for self-reference: this test file mentions the leak
|
||
# patterns by necessity to define them.
|
||
SELF = Path(__file__).resolve()
|
||
|
||
leaks: list = []
|
||
for pattern_dir in ("mempalace", "tests"):
|
||
for path in (repo_root / pattern_dir).rglob("*.py"):
|
||
if path.resolve() == SELF:
|
||
continue
|
||
try:
|
||
text = path.read_text(encoding="utf-8")
|
||
except (OSError, UnicodeDecodeError):
|
||
continue
|
||
# Use as_posix() so the allowlist (forward-slash paths) matches
|
||
# on Windows too — Path.relative_to(...) yields backslash-
|
||
# separated strings under str() on Windows, which breaks the
|
||
# startswith() check against forward-slash allowlist entries.
|
||
rel_posix = path.relative_to(repo_root).as_posix()
|
||
for line_num, line in enumerate(text.splitlines(), 1):
|
||
if leak_re.search(line):
|
||
leaks.append(f"{rel_posix}:{line_num}: {line.strip()}")
|
||
if section_re.search(line):
|
||
if not any(rel_posix.startswith(allowed) for allowed in allowed_section_paths):
|
||
leaks.append(f"{rel_posix}:{line_num}: {line.strip()}")
|
||
|
||
assert not leaks, (
|
||
"Internal-coordination jargon leaked into source or tests:\n"
|
||
+ "\n".join(f" - {leak}" for leak in leaks[:20])
|
||
+ ("\n ..." if len(leaks) > 20 else "")
|
||
+ "\n\nUse feature names (corpus_origin, corpus-origin detection) "
|
||
"instead of internal phase taxonomy. See "
|
||
"feedback_apply_naming_decision_actively.md."
|
||
)
|