feat: new MCP tools — get/list/update drawer, hook settings, export (resolves #635) (#667)

* feat: MCP reliability - inode detection, WAL rotation, metadata cache, search limits

  Infrastructure hardening for the MCP server:
  - Detect palace DB replacement via inode tracking (repair command support)
  - WAL rotation to prevent unbounded WAL growth
  - _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status
  - _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS]
  - max_distance parameter for similarity threshold in search
  - Handle all notifications/* methods, null arguments, method=None
  - Remove duplicate _client_cache = None declarations
  - searcher.py max_distance parameter passthrough

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: new MCP tools (get/list/update drawer, hook settings, memories filed), export, normalize

  New MCP tools:
  - mempalace_get_drawer: fetch single drawer by ID with full content
  - mempalace_list_drawers: paginated listing with wing/room filter
  - mempalace_update_drawer: update content/wing/room on existing drawers
  - mempalace_hook_settings: get/set hook behavior (silent_save, desktop_toast)
  - mempalace_memories_filed_away: check latest checkpoint status

  Also includes:
  - exporter.py: export palace as browsable markdown files
  - normalize.py: tool_use/tool_result capture for richer transcript mining
  - layers.py: updated for new tool integration
  - config.py: hook settings properties (hook_silent_save, hook_desktop_toast)

  Depends on PR 3 (reliability) for _MAX_RESULTS, _metadata_cache, WAL logging.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: normalize.py handles string messages and Read offset type mismatch

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: params null guard, L2→cosine docs, empty tool_use_map key guard

  - Handle explicit null in MCP params (request.get("params") or {})
  - Fix search tool description: L2 → cosine distance (collection uses hnsw:space=cosine)
  - Guard against empty string key in tool_use_map from malformed JSONL entries

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rename ambiguous var 'l' to 'line' (E741 lint)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review findings (5 issues)

  1. min_similarity backwards-compat: convert similarity to distance scale
     (1.0 - similarity) instead of passing raw value as max_distance
  2. Restore structured error reporting (error + partial fields) in tool_status,
     tool_list_wings, tool_list_rooms, tool_get_taxonomy; reverts silent
     except:pass that dropped #647 security hardening
  3. inode cache: remove falsy-zero short-circuit so missing DB file triggers
     reconnect instead of reusing stale client
  4. _fetch_all_metadata: check for empty batch before extending/advancing
     offset to prevent infinite loop on concurrent deletion
  5. KG initialization: only override path when --palace is explicit; default
     runs use KnowledgeGraph's built-in default path

  Co-authored-by: jphein <jphein@users.noreply.github.com>

---------

Co-authored-by: jp <jp@jphein.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: jphein <jphein@users.noreply.github.com>
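
Three of the small rules named in the message above (limit clamping to [1, _MAX_RESULTS], the min_similarity to max_distance conversion, and the null-params guard) can be sketched roughly as follows. This is an illustration only, not the code from this commit: the helper names clamp_limit, similarity_to_max_distance, and get_params are hypothetical. Only the value _MAX_RESULTS = 100, the formula 1.0 - similarity, and the expression request.get("params") or {} are taken from the message.

```python
_MAX_RESULTS = 100  # cap stated in the commit message


def clamp_limit(limit: int) -> int:
    """Clamp a requested result count into [1, _MAX_RESULTS] (hypothetical helper)."""
    return max(1, min(int(limit), _MAX_RESULTS))


def similarity_to_max_distance(min_similarity: float) -> float:
    """Map a cosine-similarity threshold onto the cosine-distance scale.

    Assumes distance = 1.0 - similarity, as described in review finding 1.
    """
    return 1.0 - min_similarity


def get_params(request: dict) -> dict:
    """Treat a missing "params" key and an explicit null identically."""
    return request.get("params") or {}


if __name__ == "__main__":
    print(clamp_limit(0), clamp_limit(25), clamp_limit(500))
    print(similarity_to_max_distance(0.75))
    print(get_params({"method": "tools/call", "params": None}))
```

The `or {}` form matters because `request.get("params", {})` would still return None when the client sends `"params": null` explicitly.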
2026-04-11 21:25:04 -07:00
parent 58eca5075a
commit 20c8f8e57b
9 changed files with 1429 additions and 164 deletions
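
To make the Bash result truncation in the normalize.py diff below easier to picture, here is a standalone sketch of the same head + tail idea. The function name head_tail is hypothetical, and the default n=20 is assumed to mirror _TOOL_RESULT_MAX_LINES_BASH. It is a simplified illustration rather than the committed function.

```python
def head_tail(text: str, n: int = 20) -> str:
    """Keep the first and last n lines, prefix each with an arrow,
    and replace the middle with an omission marker."""
    lines = text.split("\n")
    if len(lines) <= 2 * n:
        # Short output: keep every line
        return "→ " + "\n→ ".join(lines)
    omitted = len(lines) - 2 * n
    return (
        "→ " + "\n→ ".join(lines[:n])
        + f"\n→ ... [{omitted} lines omitted] ..."
        + "\n→ " + "\n→ ".join(lines[-n:])
    )


if __name__ == "__main__":
    sample = "\n".join(f"line {i}" for i in range(1, 8))  # 7 lines
    print(head_tail(sample, n=2))
```

With n=2 and seven input lines, the sketch keeps lines 1-2 and 6-7 and reports three omitted lines in between. Keeping the tail as well as the head is useful for command output, where errors and exit summaries tend to appear last.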
@@ -6,7 +6,7 @@ Supported:
    - Plain text with > markers (pass through)
    - Claude.ai JSON export
    - ChatGPT conversations.json
-    - Claude Code JSONL
+    - Claude Code JSONL (with tool_use/tool_result block capture)
    - OpenAI Codex CLI JSONL
    - Slack JSON export
    - Plain text (pass through for paragraph chunking)
@@ -83,6 +83,8 @@ def _try_claude_code_jsonl(content: str) -> Optional[str]:
    """Claude Code JSONL sessions."""
    lines = [line.strip() for line in content.strip().split("\n") if line.strip()]
    messages = []
+    tool_use_map = {}  # tool_use_id → tool_name
+
    for line in lines:
        try:
            entry = json.loads(line)
@@ -92,14 +94,46 @@ def _try_claude_code_jsonl(content: str) -> Optional[str]:
            continue
        msg_type = entry.get("type", "")
        message = entry.get("message", {})
+        if not isinstance(message, dict):
+            continue
+        msg_content = message.get("content", "")
+
+        # Build tool_use_map from assistant messages
+        if msg_type == "assistant" and isinstance(msg_content, list):
+            for block in msg_content:
+                if isinstance(block, dict) and block.get("type") == "tool_use":
+                    tool_id = block.get("id", "")
+                    if tool_id:
+                        tool_use_map[tool_id] = block.get("name", "Unknown")
+
        if msg_type in ("human", "user"):
-            text = _extract_content(message.get("content", ""))
+            # Check if this message is tool_results only (no user text)
+            is_tool_only = (
+                isinstance(msg_content, list)
+                and all(
+                    isinstance(b, dict) and b.get("type") == "tool_result"
+                    for b in msg_content
+                )
+            )
+            text = _extract_content(msg_content, tool_use_map=tool_use_map)
            if text:
-                messages.append(("user", text))
+                if is_tool_only and messages and messages[-1][0] == "assistant":
+                    # Append tool results to the previous assistant message
+                    prev_role, prev_text = messages[-1]
+                    messages[-1] = (prev_role, prev_text + "\n" + text)
+                elif not is_tool_only:
+                    messages.append(("user", text))
        elif msg_type == "assistant":
-            text = _extract_content(message.get("content", ""))
+            text = _extract_content(msg_content, tool_use_map=tool_use_map)
            if text:
-                messages.append(("assistant", text))
+                # If previous message is also assistant (multi-turn tool loop),
+                # merge into the same assistant turn
+                if messages and messages[-1][0] == "assistant":
+                    prev_role, prev_text = messages[-1]
+                    messages[-1] = (prev_role, prev_text + "\n" + text)
+                else:
+                    messages.append(("assistant", text))
+
    if len(messages) >= 2:
        return _messages_to_transcript(messages)
    return None
@@ -270,8 +304,14 @@ def _try_slack_json(data) -> Optional[str]:
    return None


-def _extract_content(content) -> str:
-    """Pull text from content — handles str, list of blocks, or dict."""
+def _extract_content(content, tool_use_map: dict = None) -> str:
+    """Pull text from content — handles str, list of blocks, or dict.
+
+    Args:
+        content: Message content — string, list of content blocks, or dict.
+        tool_use_map: Optional mapping of tool_use_id → tool_name, used to
+                      select the right formatting strategy for tool_result blocks.
+    """
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
@@ -279,14 +319,133 @@ def _extract_content(content) -> str:
        for item in content:
            if isinstance(item, str):
                parts.append(item)
-            elif isinstance(item, dict) and item.get("type") == "text":
-                parts.append(item.get("text", ""))
-        return " ".join(parts).strip()
+            elif isinstance(item, dict):
+                block_type = item.get("type")
+                if block_type == "text":
+                    parts.append(item.get("text", ""))
+                elif block_type == "tool_use":
+                    parts.append(_format_tool_use(item))
+                elif block_type == "tool_result":
+                    tid = item.get("tool_use_id", "")
+                    tname = (tool_use_map or {}).get(tid, "Unknown")
+                    result_content = item.get("content", "")
+                    formatted = _format_tool_result(result_content, tname)
+                    if formatted:
+                        parts.append(formatted)
+        return "\n".join(p for p in parts if p).strip()
    if isinstance(content, dict):
        return content.get("text", "").strip()
    return ""


+def _format_tool_use(block: dict) -> str:
+    """Format a tool_use block into a human-readable one-liner."""
+    name = block.get("name", "Unknown")
+    inp = block.get("input", {})
+
+    if name == "Bash":
+        cmd = inp.get("command", "")
+        if len(cmd) > 200:
+            cmd = cmd[:200] + "..."
+        return f"[Bash] {cmd}"
+
+    if name == "Read":
+        path = inp.get("file_path", "?")
+        offset = inp.get("offset")
+        limit = inp.get("limit")
+        if offset is not None and limit is not None:
+            try:
+                return f"[Read {path}:{offset}-{int(offset) + int(limit)}]"
+            except (ValueError, TypeError):
+                return f"[Read {path}:{offset}+{limit}]"
+        return f"[Read {path}]"
+
+    if name == "Grep":
+        pattern = inp.get("pattern", "")
+        target = inp.get("path") or inp.get("glob") or ""
+        return f"[Grep] {pattern} in {target}"
+
+    if name == "Glob":
+        pattern = inp.get("pattern", "")
+        return f"[Glob] {pattern}"
+
+    if name in ("Edit", "Write"):
+        path = inp.get("file_path", "?")
+        return f"[{name} {path}]"
+
+    # Unknown tool — serialize input, truncate
+    summary = json.dumps(inp, separators=(",", ":"))
+    if len(summary) > 200:
+        summary = summary[:200] + "..."
+    return f"[{name}] {summary}"
+
+
+_TOOL_RESULT_MAX_LINES_BASH = 20  # head and tail line count
+_TOOL_RESULT_MAX_MATCHES = 20     # Grep/Glob cap
+_TOOL_RESULT_MAX_BYTES = 2048     # fallback cap for unknown tools (applied to len(text), i.e. characters)
+
+
+def _format_tool_result(content, tool_name: str) -> str:
+    """Format a tool_result based on the originating tool's type.
+
+    Args:
+        content: Result text (str) or list of content blocks (list of dicts).
+        tool_name: Name of the tool that produced this result.
+
+    Returns:
+        Formatted string prefixed with ``→ ``, or empty string if omitted.
+    """
+    # Normalize list-of-blocks to plain text
+    if isinstance(content, list):
+        parts = []
+        for item in content:
+            if isinstance(item, dict) and item.get("type") == "text":
+                parts.append(item.get("text", ""))
+            elif isinstance(item, str):
+                parts.append(item)
+        text = "\n".join(parts)
+    else:
+        text = str(content) if content else ""
+
+    text = text.strip()
+    if not text:
+        return ""
+
+    # Read/Edit/Write — omit result (content is in palace or git)
+    if tool_name in ("Read", "Edit", "Write"):
+        return ""
+
+    lines = text.split("\n")
+
+    # Bash — head + tail
+    if tool_name == "Bash":
+        n = _TOOL_RESULT_MAX_LINES_BASH
+        if len(lines) <= n * 2:
+            return "→ " + "\n→ ".join(lines)
+        head = lines[:n]
+        tail = lines[-n:]
+        omitted = len(lines) - 2 * n
+        return (
+            "→ " + "\n→ ".join(head)
+            + f"\n→ ... [{omitted} lines omitted] ..."
+            + "\n→ " + "\n→ ".join(tail)
+        )
+
+    # Grep/Glob — cap matches
+    if tool_name in ("Grep", "Glob"):
+        cap = _TOOL_RESULT_MAX_MATCHES
+        if len(lines) <= cap:
+            return "→ " + "\n→ ".join(lines)
+        kept = lines[:cap]
+        remaining = len(lines) - cap
+        return "→ " + "\n→ ".join(kept) + f"\n→ ... [{remaining} more matches]"
+
+    # Unknown tool: length cap (len() counts characters, not encoded bytes)
+    if len(text) > _TOOL_RESULT_MAX_BYTES:
+        return "→ " + text[:_TOOL_RESULT_MAX_BYTES] + f"... [truncated, {len(text)} chars]"
+    return "→ " + text
+
+
 def _messages_to_transcript(messages: list, spellcheck: bool = True) -> str:
    """Convert [(role, text), ...] to transcript format with > markers."""
    if spellcheck: