feat(init): context-aware corpus detection

10 files changed. 2,563 insertions, 30 deletions. 48 new tests, including end-to-end coverage live-tested with Anthropic Haiku 4.5.

This PR overhauls the first-run experience of `mempalace init` end-to-end, ships a new corpus-origin detection module from scratch, wires it into entity classification and LLM refinement, adds a graceful-fallback path that means `init` never crashes on a missing LLM, and ships a meta-test that prevents internal-coordination jargon from leaking into source or tests.

The headline change is that `mempalace init` now understands what kind of folder you're pointing it at (AI conversations, regular writing, code, narrative) and adapts how it classifies entities accordingly. The same folder containing `Echo`, `Sparrow`, and `Cipher` (names you've assigned to AI agents) used to dump those into your "people" list alongside biological humans. Now they go into a separate `agent_personas` bucket, and your `people` list stays clean.

But the broader change is that `mempalace init` got upgraded across the board: smarter defaults, smarter degradation, smarter classification, smarter persistence, and a new way to refresh as your folder grows.

Built and live-verified with Anthropic Haiku 4.5; runs unmodified on the local LLM runtimes mempalace already supports.

## What changes for users (in order, from `pip install` onwards)

**Install:** `pip install mempalace` is unchanged. The package itself didn't shift.

**First run, `mempalace init <folder>`:**

1. **`init` examines your folder before classifying anything.** A free regex heuristic decides in milliseconds: AI conversations, regular writing, narrative, or code? If an LLM is reachable, a second pass extracts the corpus author's name and any agent persona names from the dialogue. v3.3.3 had no such step; it dove straight into entity detection with no corpus context.
2. **LLM-assisted classification is now ON by default.** v3.3.3 made `--llm` opt-in. The LLM-assisted path is qualitatively better (extracts persona names, refines ambiguous classifications, gives the model corpus context) so it now runs by default. The provider abstraction is unchanged from v3.3.3. Three buckets are supported by `mempalace.llm_client`:
   - **Anthropic** (`--llm-provider anthropic` + `ANTHROPIC_API_KEY`): the official Messages API. **This is the path live-verified end-to-end in this PR with Haiku 4.5.** Cost: ~$0.01 per `init`.
   - **Ollama** (`--llm-provider ollama`, the default): local models via `http://localhost:11434`. Fully offline. Honors the "zero-API required" promise.
   - **OpenAI-compatible** (`--llm-provider openai-compat` + `--llm-endpoint`): per the v3.3.3 `mempalace/llm_client.py` docstring, this covers "OpenRouter, LM Studio, llama.cpp server, vLLM, Groq, Fireworks, Together, and most self-hosted setups." We did not test each of those individually as part of this PR; the abstraction has been stable since v3.3.3. If you try this PR with a specific provider and hit a quirk, please file an issue or comment here.
3. **`init` never blocks on a missing LLM.** No Ollama running, no API key set? `init` prints a one-line message pointing at `--no-llm` and falls through to the heuristic-only path. New default behavior, new graceful fallback to support it. `--no-llm` is the new explicit opt-out.
4. **`init` shows you what it detected.** A one-line banner (`Detected: Claude (Anthropic) (user: Jordan, agents: Echo, Sparrow, Cipher)` or `Corpus origin: not AI-dialogue (confidence: 0.98)`) tells you at a glance whether mempalace understood your folder.
5. **Entity classification gets smarter across the board.** Even non-persona candidates benefit: the LLM has corpus context (this is AI-dialogue, this is the user's name, these are agent names) and uses it to disambiguate ambiguous candidates that aren't personas at all.
6. **Agent personas live in their own bucket.** Names you've assigned to AI agents (Echo, Sparrow, Cipher) go into a new `agent_personas` bucket instead of your `people` list. Your real-person entity list stays clean.
7. **Detection result persists to `<palace>/.mempalace/origin.json`** with a `schema_version: 1` envelope, so downstream tools can read it.
8. **Re-running `init` is now idempotent.** Bug fix: running `init` twice on the same folder used to give different classification results because the detection step was sampling its own `entities.json` output. Caught by integration testing during this PR.

**Later, when your folder grows:**

9. **`mempalace mine --redetect-origin`** is a new flag for refreshing the stored detection without redoing the whole `init`. Heuristic-only by design (the flag is meant to be cheap). If you want the full LLM-extracted detection refreshed (persona names, user name, etc.), run `mempalace init <yourfolder>` again; `init` is now idempotent (item 8), so re-running it on the same folder is safe.

## Behind the changes

- **New module** `mempalace/corpus_origin.py` (422 lines) with two-tier detection: regex heuristic with co-occurrence rule (suppresses ambiguous terms like `Claude` / `Gemini` / `Haiku` when no unambiguous AI signal is present, so French novels, astrology forums, poetry corpora, llama-rancher journals don't false-positive), and LLM tier that extracts `user_name` and `agent_persona_names` from dialogue structure with belt-and-suspenders user-vs-agent disambiguation.
- **Entity-classification consumer wiring.** `entity_detector.detect_entities` and `project_scanner.discover_entities` accept an optional `corpus_origin` kwarg. When present and the corpus is identified as AI-dialogue, candidates whose name case-insensitively matches an `agent_persona_name` are routed into the `agent_personas` bucket instead of `people`. Per-entity `type` is rewritten to `"agent_persona"`.
- **LLM-refine consumer wiring.** `llm_refine.refine_entities` accepts the same `corpus_origin` kwarg and prepends a `CORPUS CONTEXT` preamble to its system prompt giving the LLM the platform / user / persona context. Existing `TOPIC` / `PERSON` / `PROJECT` / `COMMON_WORD` / `AMBIGUOUS` labels are unchanged.
- **`init` overhaul.** Pass 0 (corpus-origin detection) inserted before existing Pass 1 (entity discovery). `--llm` flipped to default-on. `--no-llm` added. Graceful-fallback path replaces the previous hard-error on missing LLM. Provider precedence unchanged from the existing `llm_client` module.
- **`mine` flag.** `mempalace mine --redetect-origin` re-runs corpus-origin detection on the current corpus state and overwrites `<palace>/.mempalace/origin.json`.
- **`CLAUDE.md` design principle reworded:** "Local-first, zero external API by default." Local LLMs running on `localhost` (Ollama, LM Studio, llama.cpp, vLLM, unsloth studio) are part of the user's machine, not external APIs. External BYOK providers (Anthropic, OpenAI, Google) are supported but always opt-in, never default, never silent fallback.

## Cost story

- **Anthropic (verified path):** ~$0.01 per `init` via Haiku 4.5 with `ANTHROPIC_API_KEY`.
- **Ollama / local LLM runtime:** zero cost. Fully offline.
- **OpenAI-compatible service:** depends entirely on the service. The abstraction supports any service speaking the standard `/v1/chat/completions` API; specific quirks vary per provider. Try it and tell us how it goes.
- **No LLM at all:** graceful fallback to heuristic-only. Zero cost. `init` never blocks.

## Backwards compatibility

- All public function signatures gained the `corpus_origin` kwarg as optional (default `None`). Callers that don't pass it see the v3.3.3 return shape unchanged: no `agent_personas` key, no behavioral change.
- The `--llm` CLI flag is preserved as a deprecated alias of the default. Existing scripts that pass it continue to work.
- `corpus_origin=None` keeps `llm_refine.SYSTEM_PROMPT` byte-identical to v3.3.3.

## Test coverage

- **19 unit tests** in `tests/test_corpus_origin.py` covering both tiers, the co-occurrence rule, ambiguous-term suppression, word-boundary brand matching, and user/persona disambiguation.
- **29 integration tests** in `tests/test_corpus_origin_integration.py` covering end-to-end through `mempalace init`, persona reclassification, the `--redetect-origin` flag, the `--llm` default flip, graceful fallback paths, and re-init idempotency. Of those 29, five specifically cover the intersection with develop's other in-flight work (Pass 0 ↔ auto-mine ordering, topics + agent_personas bucket coexistence, entities.json shape, the `wing=` kwarg threading, llm_refine TOPIC label + corpus_origin preamble composition).
- **1354 total mempalace tests pass.** 2 pre-existing environmental failures (`test_mcp_stdio_protection`, chromadb optional dep) unrelated to this change; they fail on plain `develop` too.
- **Live-smoke-tested** with real Anthropic Haiku 4.5 on AI-dialogue and narrative fixtures.

## Hygiene guardrail

This PR also adds a meta-test (`test_no_internal_coordination_jargon_in_source_or_tests`) that walks the source tree and asserts no internal-coordination jargon (e.g. development-phase markers, internal review-section references) leaks into runtime code, comments, docstrings, or LLM prompts. RED if anything slips in. Allowlist for legitimate RFC/spec section citations in `sources/`, `backends/`, `knowledge_graph.py`, and `i18n/`.
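
For readers who want a feel for the co-occurrence rule mentioned under "Behind the changes", here is a minimal sketch. It is not the code in `mempalace/corpus_origin.py`: the term lists, the turn-marker regex, and the `brand_evidence` helper are all hypothetical names chosen for illustration. It shows only the two ideas the description names: word-boundary, case-insensitive matching, and dropping ambiguous terms unless an unambiguous AI signal is present somewhere in the corpus.

```python
from __future__ import annotations

import re

# Hypothetical term lists for illustration; the real lists may differ.
UNAMBIGUOUS_TERMS = ["ChatGPT", "OpenAI", "Anthropic", "MCP", "Claude Code"]
AMBIGUOUS_TERMS = ["Claude", "Gemini", "Haiku", "Sonnet", "Opus", "Llama", "Bard"]

# Assumed turn-marker shape: a speaker label at the start of a line.
TURN_MARKER = re.compile(r"^(?:user|human|assistant|ai):", re.IGNORECASE | re.MULTILINE)


def _count(term: str, text: str) -> int:
    # \b anchors keep 'Claudette' or 'llamas' from matching 'Claude' / 'Llama'.
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def brand_evidence(samples: list[str]) -> dict[str, int]:
    """Count brand hits, suppressing ambiguous terms that lack strong support."""
    text = "\n".join(samples)

    strong = {term: _count(term, text) for term in UNAMBIGUOUS_TERMS}
    strong = {term: n for term, n in strong.items() if n > 0}

    # Co-occurrence rule: ambiguous terms only count when the corpus also
    # contains an unambiguous term or a dialogue turn marker.
    has_strong_signal = bool(strong) or TURN_MARKER.search(text) is not None
    if not has_strong_signal:
        return {}

    weak = {term: _count(term, text) for term in AMBIGUOUS_TERMS}
    weak = {term: n for term, n in weak.items() if n > 0}
    return {**strong, **weak}
```

Under these assumptions, an astrology corpus that mentions only "Gemini" yields no evidence at all, while the same word next to "ChatGPT" is counted. A real detector would still need to turn these counts into a confidence score, which this sketch leaves out.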
2026-04-25 22:49:09 -07:00
parent 5de5b0923d
commit b99e54546b
10 changed files with 2582 additions and 30 deletions
@@ -0,0 +1,395 @@
+"""Tests for corpus_origin detection.
+
+The corpus-origin detector answers ONE foundational question before any
+downstream Pass 2 classification runs:
+
+    "Is this corpus a record of AI-agent dialogue, and if so, which platform
+     and what persona names has the user assigned to the agent?"
+
+Detection is two-tier:
+  - Tier 1: cheap content-aware heuristic (grep for well-known AI terms
+    and turn markers). No API calls. Always runs.
+  - Tier 2: LLM-assisted confirmation + persona extraction. Takes a small
+    sample of drawer texts and uses Haiku's pre-trained world knowledge
+    about Claude/ChatGPT/Gemini/etc. to confirm platform + identify
+    persona-names the user assigned to the agent.
+
+Default stance: "this IS an AI-dialogue corpus" unless strong evidence
+otherwise. False-negative (missing an AI corpus) is catastrophic for
+downstream classification; false-positive is recoverable via per-drawer
+voice-profile detection in later passes.
+
+TDD: these tests fail until mempalace/corpus_origin.py is implemented."""
+
+from mempalace.corpus_origin import (
+    CorpusOriginResult,
+    detect_origin_heuristic,
+    detect_origin_llm,
+)
+
+
+# ── Tier 1: heuristic (no LLM) ────────────────────────────────────────────
+
+
+class TestHeuristic:
+    def test_claude_heavy_corpus_detected(self):
+        """A corpus with abundant Claude references + turn markers should
+        be confidently detected as AI-dialogue."""
+        samples = [
+            "user: hey Claude, can you help me\nassistant: sure, what do you need\n",
+            "I was talking to Claude Opus about the MCP server setup",
+            "Sonnet 4.5 handled this better than Haiku 4.5 did",
+            "claude mcp add mempalace -- mempalace-mcp",
+            "human: what's up\nassistant: I'm happy to help",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert result.likely_ai_dialogue is True
+        assert result.confidence >= 0.8
+        assert "claude" in " ".join(result.evidence).lower()
+
+    def test_gpt_corpus_detected(self):
+        samples = [
+            "I asked ChatGPT to summarize my paper",
+            "The GPT-4 response was surprisingly good",
+            "user: explain quantum computing\nassistant: quantum computing uses qubits",
+            "OpenAI's model was able to help with the code",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert result.likely_ai_dialogue is True
+        assert any("GPT" in e or "OpenAI" in e for e in result.evidence)
+
+    def test_pure_narrative_corpus_detected_as_not_ai(self):
+        """A story/journal corpus with no AI signals should be flagged
+        not-AI (default stance flipped only with evidence)."""
+        samples = [
+            "Today the cat finally ventured into the garden. The dog watched.",
+            "The morning light came through the window as I wrote.",
+            "Chapter 3: The Reckoning. It was a dark and stormy night.",
+            "My father's old journal described the same field in 1972.",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert result.likely_ai_dialogue is False
+        assert result.confidence >= 0.8
+
+    def test_ambiguous_corpus_defaults_to_ai(self):
+        """When evidence is thin or mixed, default to assuming AI-dialogue.
+        False-negative is worse than false-positive."""
+        samples = [
+            "some notes about the meeting today",
+            "Later on I went to the store.",
+            "Short file with little signal.",
+        ]
+        result = detect_origin_heuristic(samples)
+        # Low signal → default stance is ai_dialogue=True with low confidence
+        assert result.likely_ai_dialogue is True
+        assert result.confidence <= 0.6
+        assert "default-stance" in " ".join(result.evidence).lower()
+
+    def test_turn_markers_alone_sufficient(self):
+        """Even without AI brand mentions, strong turn-marker presence
+        indicates dialogue structure consistent with AI corpora."""
+        samples = [
+            "user: hello\nassistant: hi there, how can I help?\nuser: summarize X\nassistant: sure",
+            "human: what's the weather\nai: I don't have real-time data\n",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert result.likely_ai_dialogue is True
+
+    # ── Pattern + context (not capitalization, not English-rule) ──────────
+
+    def test_brand_terms_case_insensitive(self):
+        """Detection cannot rely on the user typing proper-cased brand names.
+        Lowercase 'claude code', 'chatgpt', 'gemini-pro', 'mcp' must trip
+        the same as their proper-cased equivalents. NO turn-marker fallback
+        in this corpus — the brand matches must do the work."""
+        samples = [
+            "i love claude code, it just works for refactoring tasks",
+            "asked chatgpt to write a regex and it nailed it on the first try",
+            "switched to gemini-pro for the long-context summary task last week",
+            "added mempalace as an mcp server in my .claude/ settings file",
+            "anthropic's haiku model is cheap enough to run on every drawer",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert (
+            result.likely_ai_dialogue is True
+        ), f"lowercase brand terms missed; evidence: {result.evidence}"
+        # Evidence must show MULTIPLE distinct case-insensitive brand matches.
+        # 'chatgpt' lowercase only matches under case-insensitive search
+        # (the brand list has 'ChatGPT' proper-cased only).
+        evidence_str = " ".join(result.evidence).lower()
+        matched = sum(t in evidence_str for t in ("chatgpt", "anthropic", "haiku", "gemini-pro"))
+        assert (
+            matched >= 2
+        ), f"case-insensitive brand matches did not fire — only got: {result.evidence}"
+
+    def test_zodiac_corpus_not_flagged_as_ai(self):
+        """An astrology forum post with high 'Gemini' density but ZERO
+        unambiguous AI signals (no MCP/LLM/ChatGPT/turn markers) must NOT
+        be flagged as AI-dialogue. Word-sense disambiguation is required:
+        Gemini-the-zodiac-sign vs Gemini-the-AI-platform."""
+        samples = [
+            "I'm a Gemini sun, Pisces moon, and Leo rising.",
+            "Geminis are dreamers and overthinkers — that's the dual nature.",
+            "Compatibility between Gemini and Sagittarius is famously strong.",
+            "If you're a Gemini, expect Mercury retrograde to hit you hardest.",
+            "My horoscope this week says Gemini energy will dominate Wednesday.",
+            "The Gemini twins in Greek mythology are Castor and Pollux.",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert (
+            result.likely_ai_dialogue is False
+        ), f"zodiac corpus wrongly flagged AI; evidence: {result.evidence}"
+
+    def test_french_novel_with_claude_name_not_flagged(self):
+        """A French novel where 'Claude' is a character name (Claude is a
+        common French masculine name) must NOT trip AI-dialogue detection.
+        Disambiguation is by context, not by the presence of the word."""
+        samples = [
+            "Claude marchait lentement le long de la Seine ce matin-là.",
+            "« Claude, tu rentres dîner? » lui demanda sa mère depuis la cuisine.",
+            "Pour Claude, l'art de vivre passait avant tout par la patience.",
+            "Le vieux Claude se souvenait encore de la guerre, des champs déserts.",
+            "Claude ouvrit la fenêtre. Le matin sentait le pain frais et la pluie.",
+            "Les amis de Claude s'étaient réunis chez lui pour fêter ses soixante ans.",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert (
+            result.likely_ai_dialogue is False
+        ), f"French novel wrongly flagged AI; evidence: {result.evidence}"
+
+    def test_poetry_corpus_with_haiku_sonnet_not_flagged(self):
+        """A poetry corpus with high 'haiku', 'sonnet', 'opus' density
+        (poetic forms / classical music terms) but no AI infrastructure
+        terms must NOT be flagged as AI-dialogue."""
+        samples = [
+            "A haiku is seventeen syllables across three lines: 5-7-5.",
+            "Shakespeare's sonnet 18 remains the most quoted in the English canon.",
+            "Beethoven's opus 27 includes the Moonlight Sonata.",
+            "I wrote three haiku this morning before coffee.",
+            "The sonnet form arrived in England via Wyatt and Surrey.",
+            "Her first opus, published at twenty, was a song cycle for soprano.",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert (
+            result.likely_ai_dialogue is False
+        ), f"poetry corpus wrongly flagged AI; evidence: {result.evidence}"
+
+    def test_word_boundary_brand_matching(self):
+        """Brand-term matching must use word boundaries. Embedded matches
+        inside larger words ('Claudette' → 'Claude', 'opuscule' → 'Opus',
+        'sonneteer' → 'Sonnet', 'llamas' → 'Llama', 'bardic' → 'Bard')
+        must NOT be counted as brand hits.
+
+        Word boundaries don't change classification on the co-occurrence-
+        suppressed cases, but they clean up the evidence strings — false
+        matches must not appear in the audit trail. They also prevent
+        'Claude Code' from double-counting as 'Claude Code' + 'Claude'
+        overlap."""
+        samples = [
+            "My grandmother Claudette baked the most beautiful tarts every Sunday.",
+            "Two llamas were spotted near the trailhead this morning at sunrise.",
+            "Beethoven's opuscule for solo violin remained unpublished for decades.",
+            "She studied to become a sonneteer after reading the full Spenser cycle.",
+            "Bardic traditions in the Hebrides survived well into the eighteenth century.",
+            "The complete opuses of Mozart fill an entire wall of the library.",
+        ]
+        result = detect_origin_heuristic(samples)
+        evidence_str = " ".join(result.evidence).lower()
+
+        # None of the brand terms should show up in evidence — every
+        # would-be match is an embedded false-positive that word
+        # boundaries should suppress.
+        for embedded_term in ("claude", "opus", "sonnet", "llama", "bard"):
+            assert f"'{embedded_term}'" not in evidence_str, (
+                f"word-boundary bug: '{embedded_term}' falsely matched inside "
+                f"a longer word — evidence: {result.evidence}"
+            )
+
+        # And classification should be not-AI (no real AI signals present).
+        assert (
+            result.likely_ai_dialogue is False
+        ), f"corpus has no real AI signals; evidence: {result.evidence}"
+
+    def test_ambiguous_brand_with_unambiguous_signal_flagged(self):
+        """When an ambiguous brand term ('Gemini') co-occurs with an
+        UNAMBIGUOUS AI signal (turn markers, MCP, ChatGPT, Claude Code)
+        in the same corpus, the Gemini hits SHOULD count and the corpus
+        SHOULD be flagged as AI-dialogue."""
+        samples = [
+            "Switched the agent from Gemini to ChatGPT mid-session for cost reasons.",
+            "Gemini handled the long-context task; user: please summarize\nassistant: here is the summary",
+            "user: try Gemini for this\nassistant: running it through gemini-pro now",
+            "MCP server config: Gemini as primary, OpenAI as fallback.",
+        ]
+        result = detect_origin_heuristic(samples)
+        assert (
+            result.likely_ai_dialogue is True
+        ), f"ambiguous+unambiguous co-occurrence missed; evidence: {result.evidence}"
+
+
+# ── Tier 2: LLM-assisted (mocked) ─────────────────────────────────────────
+
+
+class _FakeProvider:
+    """Minimal stand-in for mempalace's LLMProvider used for testing."""
+
+    def __init__(self, canned_response):
+        self._response = canned_response
+        self.calls = []
+
+    def classify(self, system, user, json_mode=True):
+        self.calls.append({"system": system, "user": user})
+
+        class R:
+            text = self._response
+
+        return R()
+
+    def check_available(self):
+        return True, "ok"
+
+
+class TestLLMConfirmation:
+    def test_extracts_persona_names_and_platform(self):
+        fake_response = """{
+          "is_ai_dialogue_corpus": true,
+          "confidence": 0.97,
+          "primary_platform": "Claude Code (Anthropic CLI)",
+          "agent_persona_names": ["Echo", "Sparrow", "Cipher", "Orc"],
+          "evidence": [
+            "user addresses agent as 'Echo' on assistant turns",
+            "Claude Code banner text in samples",
+            "references to MCP, CLAUDE.md, hooks"
+          ]
+        }"""
+        provider = _FakeProvider(fake_response)
+        samples = [
+            "user: hey Echo, what's up\nassistant: I'm here, what do you need\n",
+            "Claude Code session banner Sonnet 4.5 Claude Pro",
+        ]
+        result = detect_origin_llm(samples, provider)
+        assert result.likely_ai_dialogue is True
+        assert result.confidence >= 0.9
+        assert "Echo" in result.agent_persona_names
+        assert "Sparrow" in result.agent_persona_names
+        assert "Claude" in result.primary_platform
+
+    def test_narrative_corpus_llm_confirms_no_agent(self):
+        fake_response = """{
+          "is_ai_dialogue_corpus": false,
+          "confidence": 0.95,
+          "primary_platform": null,
+          "agent_persona_names": [],
+          "evidence": ["pure narrative prose, no turn markers, no AI terms"]
+        }"""
+        provider = _FakeProvider(fake_response)
+        samples = ["Once upon a time in a small village", "The old woman smiled"]
+        result = detect_origin_llm(samples, provider)
+        assert result.likely_ai_dialogue is False
+        assert result.agent_persona_names == []
+        assert result.primary_platform is None
+
+    def test_handles_malformed_llm_response(self):
+        """If the LLM returns garbage, fall back gracefully to the
+        conservative default (assume AI-dialogue with low confidence)."""
+        provider = _FakeProvider("not even close to JSON")
+        result = detect_origin_llm(["sample text"], provider)
+        # Fallback: conservative default, low confidence
+        assert result.likely_ai_dialogue is True
+        assert result.confidence <= 0.5
+        assert (
+            "fallback" in " ".join(result.evidence).lower()
+            or "error" in " ".join(result.evidence).lower()
+        )
+
+    def test_filters_user_name_out_of_personas(self):
+        """Regression test: Haiku sometimes leaks the user's own name into
+        agent_persona_names despite the prompt's CRITICAL distinction. The
+        parser must strip the user's name from personas if it appears in
+        both fields (case-insensitive). The user is the human author of
+        the corpus, not an agent persona."""
+        fake_response = """{
+          "is_ai_dialogue_corpus": true,
+          "confidence": 0.97,
+          "primary_platform": "Claude (Anthropic)",
+          "user_name": "Jordan",
+          "agent_persona_names": ["Echo", "Sparrow", "Jordan", "Cipher"],
+          "evidence": ["user Jordan talks to agents Echo/Sparrow/Cipher"]
+        }"""
+        provider = _FakeProvider(fake_response)
+        result = detect_origin_llm(["sample"], provider)
+        # user_name is exposed in its own field
+        assert result.user_name == "Jordan"
+        # "Jordan" is filtered out of agent_persona_names
+        assert "Jordan" not in result.agent_persona_names
+        # Real personas are preserved
+        for persona in ("Echo", "Sparrow", "Cipher"):
+            assert persona in result.agent_persona_names
+
+    def test_filter_is_case_insensitive(self):
+        """The user-name filter works even when the LLM returns a casing
+        mismatch between user_name and the personas list."""
+        fake_response = """{
+          "is_ai_dialogue_corpus": true,
+          "confidence": 0.9,
+          "primary_platform": "Claude",
+          "user_name": "Jordan",
+          "agent_persona_names": ["Echo", "jordan", "JORDAN", "Cipher"],
+          "evidence": []
+        }"""
+        provider = _FakeProvider(fake_response)
+        result = detect_origin_llm(["sample"], provider)
+        # All case-variants of the user's name are filtered
+        assert "jordan" not in [p.lower() for p in result.agent_persona_names]
+        assert result.agent_persona_names == ["Echo", "Cipher"]
+
+    def test_user_name_field_surfaces_author(self):
+        """The user_name field captures the human author of the corpus,
+        separate from agent personas. This gives downstream passes a
+        clear 'who is the user, who is the agent' distinction."""
+        fake_response = """{
+          "is_ai_dialogue_corpus": true,
+          "confidence": 0.95,
+          "primary_platform": "ChatGPT (OpenAI)",
+          "user_name": "Sarah",
+          "agent_persona_names": ["MyAssistant"],
+          "evidence": ["Sarah writes to MyAssistant"]
+        }"""
+        provider = _FakeProvider(fake_response)
+        result = detect_origin_llm(["sample"], provider)
+        assert result.user_name == "Sarah"
+        assert result.agent_persona_names == ["MyAssistant"]
+
+
+# ── CorpusOriginResult dataclass ──────────────────────────────────────────
+
+
+class TestResultDataclass:
+    def test_result_has_all_fields(self):
+        r = CorpusOriginResult(
+            likely_ai_dialogue=True,
+            confidence=0.95,
+            primary_platform="Claude Code",
+            agent_persona_names=["Echo"],
+            evidence=["test"],
+        )
+        assert r.likely_ai_dialogue is True
+        assert r.confidence == 0.95
+        assert r.primary_platform == "Claude Code"
+        assert r.agent_persona_names == ["Echo"]
+        assert r.evidence == ["test"]
+
+    def test_result_serializes_to_dict(self):
+        r = CorpusOriginResult(
+            likely_ai_dialogue=False,
+            confidence=0.9,
+            primary_platform=None,
+            agent_persona_names=[],
+            evidence=[],
+        )
+        d = r.to_dict()
+        assert d["likely_ai_dialogue"] is False
+        assert d["primary_platform"] is None
+        assert d["agent_persona_names"] == []
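
The persona tests above (`test_filters_user_name_out_of_personas` and `test_filter_is_case_insensitive`) pin down a small parsing rule: if the LLM echoes the user's own name into `agent_persona_names`, every case variant of it is dropped and the remaining personas keep their order. A minimal sketch of that rule follows. `parse_personas` is a hypothetical helper name, not the function shipped in `mempalace/corpus_origin.py`, and it covers only the name handling, not the fallback `CorpusOriginResult` that the malformed-response test expects.

```python
from __future__ import annotations

import json


def parse_personas(raw: str) -> tuple[str | None, list[str]]:
    """Parse an LLM JSON reply into (user_name, agent_persona_names).

    Hypothetical helper: strips the user's own name from the persona list,
    comparing case-insensitively, and tolerates malformed replies.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, []
    if not isinstance(data, dict):
        return None, []

    user_name = data.get("user_name")
    if not isinstance(user_name, str) or not user_name.strip():
        user_name = None

    personas = data.get("agent_persona_names")
    if not isinstance(personas, list):
        personas = []
    # Keep only string entries; the model may return nulls or numbers.
    personas = [p for p in personas if isinstance(p, str)]

    if user_name is not None:
        blocked = user_name.strip().casefold()
        personas = [p for p in personas if p.strip().casefold() != blocked]

    return user_name, personas
```

`casefold()` is used rather than `lower()` so that the comparison also behaves sensibly for non-ASCII names; that is a design choice of this sketch, not something the tests require.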