b99e54546b
10 files changed. 2,563 insertions, 30 deletions. 48 new tests, including end-to-end coverage live-tested with Anthropic Haiku 4.5.

This PR overhauls the first-run experience of `mempalace init` end-to-end, ships a new corpus-origin detection module from scratch, wires it into entity classification and LLM refinement, adds a graceful-fallback path so that `init` never crashes on a missing LLM, and ships a meta-test that prevents internal-coordination jargon from leaking into source or tests.

The headline change is that `mempalace init` now understands what kind of folder you're pointing it at (AI conversations, regular writing, code, narrative) and adapts how it classifies entities accordingly. A folder containing `Echo`, `Sparrow`, and `Cipher` (names you've assigned to AI agents) used to dump those into your "people" list alongside biological humans. Now they go into a separate `agent_personas` bucket, and your `people` list stays clean.

But the broader change is that `mempalace init` got upgraded across the board: smarter defaults, smarter degradation, smarter classification, smarter persistence, and a new way to refresh as your folder grows.

Built and live-verified with Anthropic Haiku 4.5; runs unmodified on the local LLM runtimes mempalace already supports.

## What changes for users (in order, from `pip install` onwards)

**Install:** `pip install mempalace` is unchanged. The package itself didn't shift.

**First run: `mempalace init <folder>`:**

1. **`init` examines your folder before classifying anything.** A free regex heuristic decides in milliseconds: AI conversations, regular writing, narrative, or code? If an LLM is reachable, a second pass extracts the corpus author's name and any agent persona names from the dialogue. v3.3.3 had no such step; it dove straight into entity detection with no corpus context.
2. **LLM-assisted classification is now ON by default.** v3.3.3 made `--llm` opt-in. The LLM-assisted path is qualitatively better (extracts persona names, refines ambiguous classifications, gives the model corpus context) so it now runs by default. The provider abstraction is unchanged from v3.3.3. Three buckets are supported by `mempalace.llm_client`:
   - **Anthropic** (`--llm-provider anthropic` + `ANTHROPIC_API_KEY`): the official Messages API. **This is the path live-verified end-to-end in this PR with Haiku 4.5.** Cost: ~\$0.01 per `init`.
   - **Ollama** (`--llm-provider ollama`, the default): local models via `http://localhost:11434`. Fully offline. Honors the "zero-API required" promise.
   - **OpenAI-compatible** (`--llm-provider openai-compat` + `--llm-endpoint`): per the v3.3.3 `mempalace/llm_client.py` docstring, this covers "OpenRouter, LM Studio, llama.cpp server, vLLM, Groq, Fireworks, Together, and most self-hosted setups." We did not test each of those individually as part of this PR; the abstraction has been stable since v3.3.3. If you try this PR with a specific provider and hit a quirk, please file an issue or comment here.
3. **`init` never blocks on a missing LLM.** No Ollama running, no API key set? `init` prints a one-line message pointing at `--no-llm` and falls through to the heuristic-only path. The default-on LLM path is new, and so is the graceful fallback that supports it. `--no-llm` is the new explicit opt-out.
4. **`init` shows you what it detected.** A one-line banner, such as `Detected: Claude (Anthropic) (user: Jordan, agents: Echo, Sparrow, Cipher)` or `Corpus origin: not AI-dialogue (confidence: 0.98)`, tells you at a glance whether mempalace understood your folder.
5. **Entity classification gets smarter across the board.** Even non-persona candidates benefit: the LLM has corpus context (this is AI-dialogue, this is the user's name, these are agent names) and uses it to resolve ambiguous candidates that aren't personas at all.
6. **Agent personas live in their own bucket.** Names you've assigned to AI agents (Echo, Sparrow, Cipher) go into a new `agent_personas` bucket instead of your `people` list. Your real-person entity list stays clean.
7. **Detection result persists to `<palace>/.mempalace/origin.json`** with a `schema_version: 1` envelope, so downstream tools can read it.
8. **Re-running `init` is now idempotent.** This is a bug fix: running `init` twice on the same folder used to give different classification results because the detection step was sampling its own `entities.json` output. Caught by integration testing during this PR.

**Later, when your folder grows:**

9. **`mempalace mine --redetect-origin`** is a new flag for refreshing the stored detection without redoing the whole `init`. Heuristic-only by design (the flag is meant to be cheap). If you want the full LLM-extracted detection refreshed (persona names, user name, etc.), run `mempalace init <yourfolder>` again. `init` is now idempotent (item 8), so re-running it on the same folder is safe.

## Behind the changes

- **New module** `mempalace/corpus_origin.py` (422 lines) with two-tier detection: a regex heuristic with a co-occurrence rule (it suppresses ambiguous terms like `Claude` / `Gemini` / `Haiku` when no unambiguous AI signal is present, so French novels, astrology forums, poetry corpora, and llama-rancher journals don't false-positive), and an LLM tier that extracts `user_name` and `agent_persona_names` from dialogue structure with belt-and-suspenders user-vs-agent disambiguation.
- **Entity-classification consumer wiring.** `entity_detector.detect_entities` and `project_scanner.discover_entities` accept an optional `corpus_origin` kwarg. When present and the corpus is identified as AI-dialogue, candidates whose name case-insensitively matches an `agent_persona_name` are routed into the `agent_personas` bucket instead of `people`. Per-entity `type` is rewritten to `"agent_persona"`.
- **LLM-refine consumer wiring.** `llm_refine.refine_entities` accepts the same `corpus_origin` kwarg and prepends a `CORPUS CONTEXT` preamble to its system prompt giving the LLM the platform / user / persona context. Existing `TOPIC` / `PERSON` / `PROJECT` / `COMMON_WORD` / `AMBIGUOUS` labels are unchanged.
- **`init` overhaul.** Pass 0 (corpus-origin detection) inserted before existing Pass 1 (entity discovery). `--llm` flipped to default-on. `--no-llm` added. Graceful-fallback path replaces the previous hard error on a missing LLM. Provider precedence unchanged from the existing `llm_client` module.
- **`mine` flag.** `mempalace mine --redetect-origin` re-runs corpus-origin detection on the current corpus state and overwrites `<palace>/.mempalace/origin.json`.
- **`CLAUDE.md` design principle reworded** to "Local-first, zero external API by default." Local LLMs running on `localhost` (Ollama, LM Studio, llama.cpp, vLLM, unsloth studio) are part of the user's machine, not external APIs. External BYOK providers (Anthropic, OpenAI, Google) are supported but always opt-in, never default, never silent fallback.

## Cost story

- **Anthropic (verified path):** ~\$0.01 per `init` via Haiku 4.5 with `ANTHROPIC_API_KEY`.
- **Ollama / local LLM runtime:** zero cost. Fully offline.
- **OpenAI-compatible service:** depends entirely on the service. The abstraction supports any service speaking the standard `/v1/chat/completions` API; specific quirks vary per provider. Try it and tell us how it goes.
- **No LLM at all:** graceful fallback to heuristic-only. Zero cost. `init` never blocks.

## Backwards compatibility

- The affected public functions (`entity_detector.detect_entities`, `project_scanner.discover_entities`, `llm_refine.refine_entities`) gained the `corpus_origin` kwarg as optional (default `None`). Callers that don't pass it see the v3.3.3 return shape unchanged: no `agent_personas` key, no behavioral change.
- The `--llm` CLI flag is preserved as a deprecated alias of the default. Existing scripts that pass it continue to work.
- `corpus_origin=None` keeps `llm_refine.SYSTEM_PROMPT` byte-identical to v3.3.3.

## Test coverage

- **19 unit tests** in `tests/test_corpus_origin.py` covering both tiers, the co-occurrence rule, ambiguous-term suppression, word-boundary brand matching, and user/persona disambiguation.
- **29 integration tests** in `tests/test_corpus_origin_integration.py` covering end-to-end through `mempalace init`, persona reclassification, the `--redetect-origin` flag, the `--llm` default flip, graceful fallback paths, and re-init idempotency. Of those 29, five specifically cover the intersection with develop's other in-flight work (Pass 0 ↔ auto-mine ordering, topics + agent_personas bucket coexistence, entities.json shape, the `wing=` kwarg threading, llm_refine TOPIC label + corpus_origin preamble composition).
- **1354 total mempalace tests pass.** 2 pre-existing environmental failures (`test_mcp_stdio_protection`, chromadb optional dep) are unrelated to this change; they fail on plain `develop` too.
- **Live-smoke-tested** with real Anthropic Haiku 4.5 on AI-dialogue and narrative fixtures.

## Hygiene guardrail

This PR also adds a meta-test (`test_no_internal_coordination_jargon_in_source_or_tests`) that walks the source tree and asserts no internal-coordination jargon (e.g. development-phase markers, internal review-section references) leaks into runtime code, comments, docstrings, or LLM prompts. The test goes red if anything slips in. An allowlist covers legitimate RFC/spec section citations in `sources/`, `backends/`, `knowledge_graph.py`, and `i18n/`.
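
The persona routing described under "Behind the changes" can be sketched in isolation. This is a minimal illustration, not the code in `entity_detector`: the helper name `route_agent_personas`, the dict shapes for `detected` and `corpus_origin`, and the `"person"` type value are all assumptions made for the example. Only the rules themselves (case-insensitive name match, `type` rewritten to `"agent_persona"`, no `agent_personas` key when no origin is passed) come from the description above.

```python
from typing import Optional


def route_agent_personas(detected: dict, corpus_origin: Optional[dict]) -> dict:
    """Hypothetical helper: move persona-named candidates out of `people`.

    Assumed shapes (not taken from the real module):
      detected      -> {"people": [{"name": ..., "type": ...}], ...}
      corpus_origin -> CorpusOriginResult fields as a plain dict
    """
    # No origin, or not AI-dialogue: return the input untouched,
    # so no `agent_personas` key appears in the result.
    if not corpus_origin or not corpus_origin.get("likely_ai_dialogue"):
        return detected

    persona_names = {n.lower() for n in corpus_origin.get("agent_persona_names", [])}
    people, personas = [], []
    for entity in detected.get("people", []):
        if entity["name"].lower() in persona_names:
            # Rewrite the per-entity type when rerouting.
            personas.append({**entity, "type": "agent_persona"})
        else:
            people.append(entity)

    return {**detected, "people": people, "agent_personas": personas}


detected = {
    "people": [
        {"name": "Jordan", "type": "person"},
        {"name": "echo", "type": "person"},
        {"name": "Sparrow", "type": "person"},
    ],
    "projects": [],
}
origin = {
    "likely_ai_dialogue": True,
    "agent_persona_names": ["Echo", "Sparrow", "Cipher"],
}

routed = route_agent_personas(detected, origin)
print([e["name"] for e in routed["people"]])          # ['Jordan']
print([e["name"] for e in routed["agent_personas"]])  # ['echo', 'Sparrow']
print("agent_personas" in route_agent_personas(detected, None))  # False
```

Returning the input unchanged when `corpus_origin` is `None` is what keeps the v3.3.3 return shape intact for existing callers.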
423 lines
17 KiB
Python
"""
corpus_origin.py: detect whether a corpus is an AI-dialogue record and,
if so, what platform and what persona names the user has assigned to the
agent.

This is the first question any downstream Pass 2 classification needs
answered. Without it, a drawer like "my three sons" in a Claude Code
dialogue corpus can't be correctly resolved to "three AI instances"
rather than "three biological children."

Two-tier detection:

  Tier 1: detect_origin_heuristic(samples)
      Cheap, no API. Grep for well-known AI brand terms + turn
      markers. Always runs. Outputs a hypothesis.

  Tier 2: detect_origin_llm(samples, provider)
      Uses an LLMProvider (typically Haiku via mempalace.llm_client)
      with the model's pre-trained knowledge of Claude/ChatGPT/Gemini
      etc. Confirms platform, extracts agent persona-names the user
      has assigned. One call, ~$0.01 cost.

Design principle:
  Don't make the classifier re-discover what Claude, ChatGPT, Gemini, MCP,
  or other well-known entities ARE; the LLM already knows them from its
  training. Only corpus-specific entities (e.g. the user's persona-name
  for their Claude instance) need discovery.

Default stance (when evidence is thin):
  "This IS an AI-dialogue corpus". A false-negative is catastrophic for
  downstream classification; a false-positive is recoverable via per-drawer
  voice-profile detection in later passes.
"""

from __future__ import annotations

import json
import re
from dataclasses import dataclass, field, asdict
from typing import Optional


# ── Well-known AI brand terms (expand as new platforms emerge) ────────────
# Detection is by PATTERN + CONTEXT, not by capitalization or English-language
# rules. Two categories:
#
#   UNAMBIGUOUS: terms that have essentially no meaning outside of AI context.
#     Always counted toward AI-dialogue evidence.
#
#   AMBIGUOUS: terms that share a string with common English words, names,
#     poetry forms, zodiac signs, animals, etc. Counted toward AI-dialogue
#     evidence ONLY when at least one unambiguous AI signal also appears in
#     the corpus (turn marker, unambiguous brand term, or AI infrastructure
#     term). This avoids false-positives on French novels with characters
#     named "Claude", astrology corpora discussing "Gemini", poetry corpora
#     full of "haiku" / "sonnet", etc.
#
# All matching is CASE-INSENSITIVE because users type lowercase constantly.

_AI_UNAMBIGUOUS_TERMS = [
    # Anthropic-specific
    "Anthropic",
    "Claude Code",
    "Claude 3",
    "Claude 4",
    "claude mcp",
    "CLAUDE.md",
    ".claude/",
    # OpenAI-specific
    "ChatGPT",
    "GPT-4",
    "GPT-3",
    "GPT-5",
    "OpenAI",
    "gpt-4o",
    "gpt-4-turbo",
    "o1-preview",
    "o3",
    # Google-specific
    "gemini-pro",
    "gemini-1.5",
    "Google AI",
    # Meta / others (specific model identifiers, not bare common words)
    "Mixtral",
    "Cohere",
    # AI-infrastructure terms with no common-English collision
    "MCP",
    "LLM",
    "RAG",
    "fine-tune",
    "context window",
    "embedding",
]

_AI_AMBIGUOUS_TERMS = [
    # Anthropic: bare brand/model names that collide with names + poetry
    "Claude",  # also a common French masculine name
    "Opus",  # also a musical work, comic strip, magazine
    "Sonnet",  # also a 14-line poem form
    "Haiku",  # also a 17-syllable poem form
    # Google: bare brand that collides with zodiac sign
    "Gemini",  # also the zodiac sign
    "Bard",  # also a poet / Shakespeare
    # Meta / others
    "Llama",  # also the South American animal
    "Mistral",  # also a Mediterranean wind
    # Note: 'prompt', 'completion', 'tokens' previously lived here but were
    # removed: they're suppressed without an unambiguous co-signal anyway,
    # and by the time a co-signal is present the corpus is already flagged.
    # Keeping them just produced noisier evidence strings.
]

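# Turn-marker patterns commonly seen in AI-dialogue transcripts.
# The '>>>' patterns carry no leading \b: '>' is a non-word character, so
# a leading \b would only match when a word character directly precedes
# the marker, and would miss the usual case of '>>>' at the start of a line.
_TURN_MARKERS = [
    r"\buser\s*:\s*",
    r"\bassistant\s*:\s*",
    r"\bhuman\s*:\s*",
    r"\bai\s*:\s*",
    r">>>\s*User\b",
    r">>>\s*Assistant\b",
]

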
def _brand_pattern(term: str) -> str:
    """Build a regex for a brand term that uses word boundaries
    only on edges where the term itself starts/ends with a word
    character. Without this nuance:
      - 'Claude' would falsely match inside 'Claudette' (no \\b)
      - '.claude/' would fail to match at start of string (\\b
        before non-word char requires preceding word char)
    So we only attach \\b where it actually makes sense."""
    escaped = re.escape(term)
    prefix = r"\b" if term[0].isalnum() or term[0] == "_" else ""
    suffix = r"\b" if term[-1].isalnum() or term[-1] == "_" else ""
    return prefix + escaped + suffix


@dataclass
class CorpusOriginResult:
    """Structured output from corpus-origin detection.

    Fields:
      likely_ai_dialogue: best hypothesis about whether this is AI-dialogue
      confidence: 0.0 to 1.0
      primary_platform: e.g. "Claude Code (Anthropic CLI)" or None
      user_name: the corpus author's name if identifiable from context, else None
      agent_persona_names: names the user has assigned to the AI agent(s)
        (e.g. ["Echo", "Sparrow"]). Does NOT include the user's own name.
      evidence: human-readable reasons for the classification
    """

    likely_ai_dialogue: bool
    confidence: float
    primary_platform: Optional[str]
    user_name: Optional[str] = None
    agent_persona_names: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        return asdict(self)


# ── Tier 1: cheap heuristic ───────────────────────────────────────────────


def detect_origin_heuristic(samples: list[str]) -> CorpusOriginResult:
    """Fast grep-based detection. No API calls.

    Scores AI-dialogue likelihood by counting:
      - occurrences of well-known AI brand terms
      - turn-marker patterns (user:, assistant:, etc.)

    Returns a CorpusOriginResult with confidence derived from signal density.
    """
    combined = "\n\n".join(samples)
    total_chars = max(1, len(combined))

    # Count UNAMBIGUOUS brand-term hits (case-insensitive: users type
    # lowercase constantly, so 'chatgpt' must trip the same as 'ChatGPT').
    # Word boundaries prevent false in-word matches (see _brand_pattern).
    unambiguous_hits: dict[str, int] = {}
    total_unambiguous = 0
    for term in _AI_UNAMBIGUOUS_TERMS:
        matches = re.findall(_brand_pattern(term), combined, re.IGNORECASE)
        if matches:
            unambiguous_hits[term] = len(matches)
            total_unambiguous += len(matches)

    # Count AMBIGUOUS brand-term hits separately. These will only be
    # counted toward AI-dialogue evidence if the corpus also contains
    # at least one unambiguous AI signal (see co-occurrence rule below).
    ambiguous_hits: dict[str, int] = {}
    total_ambiguous = 0
    for term in _AI_AMBIGUOUS_TERMS:
        matches = re.findall(_brand_pattern(term), combined, re.IGNORECASE)
        if matches:
            ambiguous_hits[term] = len(matches)
            total_ambiguous += len(matches)

    # Count turn-marker hits (case-insensitive: transcripts vary).
    turn_hits = 0
    turn_types_found = set()
    for pattern in _TURN_MARKERS:
        matches = re.findall(pattern, combined, re.IGNORECASE)
        if matches:
            turn_hits += len(matches)
            turn_types_found.add(pattern)

    # Co-occurrence rule for ambiguous terms.
    # Ambiguous terms (e.g. 'Claude' as a French name, 'Gemini' as a zodiac
    # sign, 'Haiku' as a poem form) only count toward brand evidence if
    # the corpus also contains at least one unambiguous AI signal. Otherwise
    # we'd false-positive on French novels, astrology forums, poetry corpora,
    # llama-rancher journals, etc.
    has_ai_context = total_unambiguous > 0 or turn_hits > 0
    counted_brand_hits = total_unambiguous + (total_ambiguous if has_ai_context else 0)

    # Brand-term density per 1000 chars; turn-marker density likewise.
    # Tuned on a small set of examples; these aren't magic numbers and
    # can be revisited as we see more corpora.
    brand_density = counted_brand_hits / (total_chars / 1000)
    turn_density = turn_hits / (total_chars / 1000)

    # Build evidence list
    evidence: list[str] = []
    shown_hits = dict(unambiguous_hits)
    if has_ai_context:
        shown_hits.update(ambiguous_hits)
    if shown_hits:
        top_terms = sorted(shown_hits.items(), key=lambda x: -x[1])[:5]
        evidence.append("AI brand terms: " + ", ".join(f"'{k}' ({v}x)" for k, v in top_terms))
    elif ambiguous_hits and not has_ai_context:
        # Be transparent that we saw ambiguous matches but suppressed them
        # for lack of co-occurring AI context.
        suppressed = sorted(ambiguous_hits.items(), key=lambda x: -x[1])[:3]
        evidence.append(
            "Ambiguous terms present but suppressed (no co-occurring AI signal): "
            + ", ".join(f"'{k}' ({v}x)" for k, v in suppressed)
        )
    if turn_hits:
        evidence.append(
            f"Turn markers detected: {turn_hits} occurrences across {len(turn_types_found)} pattern types"
        )

    # Decision logic:
    #   strong signal (brand density OR turn density at or above its
    #     threshold) → confident AI-dialogue
    #   MEANINGFUL absence (enough text, zero brand, zero turn) → confident narrative
    #   ambiguous or insufficient text → default stance: AI-dialogue with low confidence
    #
    # Threshold for "meaningful absence": the samples collectively have to
    # be long enough that AI signals would be expected to surface if the
    # corpus really were AI-dialogue. 150 chars is the working floor;
    # below that, we cannot confidently say "this is narrative."
    MEANINGFUL_TEXT_FLOOR = 150

    if brand_density >= 0.5 or turn_density >= 2.0:
        return CorpusOriginResult(
            likely_ai_dialogue=True,
            confidence=min(0.95, 0.6 + 0.1 * (brand_density + turn_density)),
            primary_platform=None,  # tier 2 will refine
            evidence=evidence,
        )
    if counted_brand_hits == 0 and turn_hits == 0 and total_chars >= MEANINGFUL_TEXT_FLOOR:
        # Note: ambiguous-only matches (e.g. a French novel with 'Claude' as
        # a character name) flow through here because counted_brand_hits == 0
        # when no unambiguous AI signal co-occurs. The 'evidence' list still
        # records that the ambiguous matches were seen and suppressed.
        narrative_evidence = list(evidence) + [
            f"no unambiguous AI signal across {total_chars} chars of text — pure narrative"
        ]
        return CorpusOriginResult(
            likely_ai_dialogue=False,
            confidence=0.9,
            primary_platform=None,
            evidence=narrative_evidence,
        )
    # Ambiguous or too-short-to-tell case: default stance is AI-dialogue
    # with explicit low confidence. Tier 2 (LLM) should be called to confirm.
    reason = "weak signal" if (counted_brand_hits or turn_hits) else "insufficient text"
    return CorpusOriginResult(
        likely_ai_dialogue=True,
        confidence=0.4,
        primary_platform=None,
        evidence=evidence
        + [
            f"{reason} — applying default-stance (ai_dialogue=True, low confidence). "
            "Tier 2 LLM check recommended to confirm or override."
        ],
    )


# ── Tier 2: LLM-assisted confirmation + persona extraction ────────────────


_SYSTEM_PROMPT = """You are analyzing a corpus of text to determine whether it is a \
record of conversations with an AI agent (e.g. Claude, ChatGPT, Gemini, custom LLM \
apps), or some other kind of text (personal narrative, story, research notes, \
journal, code, etc.).

Use your pre-existing knowledge of well-known AI platforms. You don't need the \
corpus to explain what Claude or ChatGPT is — you already know. Your job is to \
detect evidence of their presence and identify what persona-names the user has \
assigned to the agent(s) they converse with.

CRITICAL distinction:
- agent_persona_names are names the USER has assigned to the AI AGENT(S)
  they converse with. Example: "Echo", "Sparrow", "Henry" might be names
  the user calls a Claude instance they're building a relationship with.
- Do NOT include the USER's own name in agent_persona_names. The user
  is the human author of the corpus, not a persona of the agent. Even
  if the user's name appears frequently in the text (writing about
  themselves), that is NOT an agent persona.
- If you can identify the user's name from context, put it in user_name
  (separate field). If unclear, leave user_name null.

Respond with JSON only (no prose before or after):
{
  "is_ai_dialogue_corpus": <true|false>,
  "confidence": <0.0 to 1.0>,
  "primary_platform": <"Claude (Anthropic)" | "ChatGPT (OpenAI)" | "Gemini (Google)" | other platform name | null>,
  "user_name": <user's name if clearly identifiable from context, else null>,
  "agent_persona_names": [<names the user has assigned to the AI AGENT(S), NOT the user's own name>],
  "evidence": [<short bullet strings explaining the decision>]
}

Default stance: if evidence is thin or mixed, return is_ai_dialogue_corpus=true \
with low confidence. False-negatives on AI-dialogue detection break downstream \
classification; false-positives are recoverable later.
"""


def _extract_json(text: str) -> Optional[dict]:
    """Pull the first JSON object out of a possibly-messy LLM response."""
    text = text.strip()
    if not text:
        return None
    # Try straight parse first
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Try to find a {...} block, tracking string state so that braces
    # inside JSON string values don't affect the depth count.
    start = text.find("{")
    if start < 0:
        return None
    depth = 0
    in_string = False
    escape = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escape:
                escape = False
            elif ch == "\\":
                escape = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                candidate = text[start : i + 1]
                try:
                    return json.loads(candidate)
                except json.JSONDecodeError:
                    return None
    return None


def detect_origin_llm(samples: list[str], provider) -> CorpusOriginResult:
    """LLM-assisted detection. Takes samples (list of drawer-text excerpts)
    and an LLMProvider (mempalace.llm_client.LLMProvider). Returns the
    same CorpusOriginResult shape as the heuristic.

    Falls back conservatively (default-stance ai=True, low confidence)
    on any LLM error or malformed response; never raises.
    """
    # Build the user prompt: concise excerpts, capped so we stay cheap
    max_excerpt_chars = 800
    excerpts = "\n\n---\n\n".join(
        f"[sample {i + 1}]\n{s[:max_excerpt_chars]}" for i, s in enumerate(samples[:20])
    )
    user_prompt = f"CORPUS EXCERPTS:\n\n{excerpts}\n\nAnalyze and respond with JSON."

    try:
        resp = provider.classify(system=_SYSTEM_PROMPT, user=user_prompt, json_mode=True)
        raw = getattr(resp, "text", "") or ""
    except Exception as e:
        return CorpusOriginResult(
            likely_ai_dialogue=True,
            confidence=0.3,
            primary_platform=None,
            evidence=[f"LLM provider error (fallback to default stance): {e}"],
        )

    parsed = _extract_json(raw)
    if not parsed or not isinstance(parsed, dict):
        return CorpusOriginResult(
            likely_ai_dialogue=True,
            confidence=0.3,
            primary_platform=None,
            evidence=["LLM response was not valid JSON (fallback to default stance)"],
        )

    # Pull fields defensively. The LLM may return wrong types (a number as
    # a persona name, "high" as a confidence), and this function must never
    # raise, so coerce or drop instead of trusting the shape.
    raw_user = parsed.get("user_name")
    user_name = raw_user.strip() if isinstance(raw_user, str) and raw_user.strip() else None
    raw_personas = parsed.get("agent_persona_names")
    personas = [p for p in raw_personas if isinstance(p, str)] if isinstance(raw_personas, list) else []
    # If the LLM leaked the user_name into agent_persona_names despite the
    # prompt telling it not to, filter it out.
    if user_name:
        personas = [p for p in personas if p.lower() != user_name.lower()]
    try:
        confidence = float(parsed.get("confidence", 0.5))
    except (TypeError, ValueError):
        confidence = 0.5
    raw_evidence = parsed.get("evidence")
    evidence = [str(e) for e in raw_evidence] if isinstance(raw_evidence, list) else []
    return CorpusOriginResult(
        likely_ai_dialogue=bool(parsed.get("is_ai_dialogue_corpus", True)),
        confidence=confidence,
        primary_platform=parsed.get("primary_platform") or None,
        user_name=user_name,
        agent_persona_names=personas,
        evidence=evidence,
    )
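
The edge-aware word-boundary technique used by `_brand_pattern` in the listing above can be checked on its own. The sketch below copies that logic into a standalone function (named `brand_pattern` here only so the block runs without the module) and compares it with a naive pattern that always wraps the term in `\b`. The helper `count_hits` and the sample strings are illustrative assumptions.

```python
import re


def brand_pattern(term: str) -> str:
    """Attach \\b only on edges where the term starts/ends with a word char."""
    escaped = re.escape(term)
    prefix = r"\b" if term[0].isalnum() or term[0] == "_" else ""
    suffix = r"\b" if term[-1].isalnum() or term[-1] == "_" else ""
    return prefix + escaped + suffix


def count_hits(term: str, text: str) -> int:
    """Illustrative helper: case-insensitive hit count for one term."""
    return len(re.findall(brand_pattern(term), text, re.IGNORECASE))


# 'Claude' must not match inside 'Claudette', but lowercase 'claude' counts.
print(count_hits("Claude", "Claudette met claude."))  # 1

# '.claude/' starts with a non-word character, so no leading \b is attached
# and the term matches at the very start of the string.
print(count_hits(".claude/", ".claude/settings.json"))  # 1

# A naive pattern with \b on both edges misses that same occurrence,
# because there is no word boundary between start-of-string and '.'.
naive = r"\b" + re.escape(".claude/") + r"\b"
print(len(re.findall(naive, ".claude/settings.json")))  # 0
```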
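
To see how the Tier 1 thresholds play out on concrete numbers, here is a small worked sketch. It re-implements only the decision arithmetic from `detect_origin_heuristic` (densities per 1000 characters, the 0.5 and 2.0 thresholds, the 150-character floor, and the capped confidence formula). The function name `heuristic_verdict` is hypothetical, and `brand_hits` stands for the brand hits that remain after the co-occurrence rule has been applied.

```python
from __future__ import annotations


def heuristic_verdict(brand_hits: int, turn_hits: int, total_chars: int) -> tuple[bool, float]:
    """Return (likely_ai_dialogue, confidence) from pre-counted signals."""
    total_chars = max(1, total_chars)
    brand_density = brand_hits / (total_chars / 1000)
    turn_density = turn_hits / (total_chars / 1000)

    # Strong signal: either density reaches its threshold.
    if brand_density >= 0.5 or turn_density >= 2.0:
        return True, min(0.95, 0.6 + 0.1 * (brand_density + turn_density))
    # Meaningful absence: enough text and no signal at all.
    if brand_hits == 0 and turn_hits == 0 and total_chars >= 150:
        return False, 0.9
    # Weak signal or too little text: default stance, low confidence.
    return True, 0.4


cases = {
    "2 brand hits in 4000 chars": heuristic_verdict(2, 0, 4000),
    "3 brand + 6 turn hits in 1000 chars": heuristic_verdict(3, 6, 1000),
    "no signal in 2000 chars": heuristic_verdict(0, 0, 2000),
    "1 brand hit in 4000 chars": heuristic_verdict(1, 0, 4000),
    "no signal in 80 chars": heuristic_verdict(0, 0, 80),
}
for label, (is_ai, conf) in cases.items():
    print(f"{label}: ai_dialogue={is_ai}, confidence={conf:.2f}")
```

The first case sits exactly on the brand threshold (2 hits / 4 kchars = 0.5), giving 0.6 + 0.1 × 0.5 = 0.65. The second would compute 0.6 + 0.1 × 9 = 1.5 and is capped at 0.95. The last two both fall through to the default stance, one for weak signal and one for insufficient text.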