- Switch CI install step from `pip install -r requirements.txt` to
`pip install -e ".[dev]"` since requirements.txt was removed
- Add noqa: E402 to intentionally-late imports in conftest.py
(HOME must be isolated before mempalace imports)
- Remove unused KnowledgeGraph import in test_knowledge_graph.py
- Apply ruff formatting to test files
Replace the ─ (U+2500) separator character with - in convo_miner.py.
Windows terminals using cp1252 encoding raise UnicodeEncodeError when
printing this character unless PYTHONUTF8=1 is set explicitly.
Fixes crash on Windows: UnicodeEncodeError: 'charmap' codec can't encode
character '\u2500'
Pass yes flag through to detect_rooms_local so init --yes
skips both entity detection AND room approval prompts.
Agents and CI can now run init without interactive input.
Fixes#8
- Add `mempalace repair` command to rebuild vector index from SQLite
when HNSW files are corrupted after crash/interrupt (fixes#74, #72, #96)
- Fix split command passing dir as positional instead of --source
flag to split_mega_files (fixes#63)
- Handle Claude privacy export format (array of conversation objects
with chat_messages inside each) in normalize.py (fixes#63)
- Persist room keywords in mempalace.yaml so mine can match files
in docs/ to room "documentation" (fixes#108)
- hooks/mempal_save_hook.sh: pass $TRANSCRIPT_PATH as sys.argv
instead of interpolating into python -c string (fixes#110)
- normalize.py: accept type "user" in addition to "human" for
Claude Code JSONL sessions (fixes#111)
- convo_miner.py: skip tool-results/, memory/ dirs and .meta.json
files when scanning for conversations (fixes#111)
- pyproject.toml: pin chromadb>=0.4.0,<1 to avoid crashing 1.x
builds on macOS ARM64 (fixes#100)
The community caught real problems within hours of launch. Addressing them
directly:
- Added prominent "A Note from Milla & Ben" section at top owning the issues
- Fixed AAAK section: removed "lossless" claim, removed bogus token example,
honest about lossy nature and 84.2% regression on LongMemEval
- Headline benchmark table: clearly labeled as "raw mode" (the 96.6% number)
- Removed misleading "100%" headline (still real but rerank pipeline not
in public scripts yet — addressing)
- Removed misleading "+34% palace boost" headline (it's metadata filtering,
real but not a novel mechanism)
- Marked Contradiction Detection as "experimental, not yet wired into KG ops"
- Closet legend now notes plain-text summaries in v3.0.0, AAAK closets coming
- Intro pillars rewritten honestly — raw verbatim is the win, AAAK is
experimental compression layer
Thank you to @panuhorsmalahti (#43), @lhl (#27), @gizmax (#39) and everyone
who filed issues in the first 48 hours. Brutal honest criticism is exactly
what makes open source work.
Add _try_codex_jsonl parser for Codex CLI session files stored at
~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl.
Uses only event_msg entries (user_message / agent_message) which
represent the canonical conversation turns. response_item entries
are intentionally skipped — they include synthetic context injections
(environment_context) and can duplicate real messages when both
representations are present in the same rollout.
Format based on Codex source tests (codex-rs/rollout/src/recorder_tests.rs).
Requires session_meta header to reduce false positives on other JSONL.
Refs: #59
Replace broad except Exception with specific exception types in 6
sites where the expected failure mode is well-defined:
- normalize.py: OSError for file read, ImportError for optional import
- miner.py: OSError for file read_text
- entity_detector.py: OSError for file read in scan loop
- convo_miner.py: (OSError, ValueError) for normalize which reads
and parses files
- entity_registry.py: (URLError, OSError, JSONDecodeError, KeyError)
for Wikipedia lookup fallback
ChromaDB except Exception sites (~30) are left broad for now.
chromadb.errors defines NotFoundError, DuplicateIDError,
InvalidDimensionException etc., but narrowing those sites requires
importing from chromadb.errors and validating across supported
versions (>=0.4.0). MCP server handlers also left broad for
resilience.
Add usedforsecurity=False to hashlib.md5() calls in miner.py and
convo_miner.py to document that MD5 is used for deterministic ID
generation, not cryptographic security. Preserves stable drawer IDs
for backward compatibility with existing palaces.
Swapping to SHA-256 would change the ID formula and make existing
drawers unreachable on re-ingestion. PR #34 covers the MD5 sites
in knowledge_graph.py and mcp_server.py.
Verified: usedforsecurity kwarg is supported since Python 3.9
(project target per pyproject.toml line 10), confirmed via Context7
CPython docs.
Remove discarded `query.lower()` call in `extract_people_from_query` —
strings are immutable so the result was always thrown away. The existing
`re.IGNORECASE` flag already handles case-insensitive matching.
Remove duplicate literals in COMMON_ENGLISH_WORDS set: "hunter" (consecutive
duplicate), "april" and "june" (appeared in both names and months sections).
Add Milla's conversational explanation of the palace architecture
(wings → rooms → closets → drawers → halls → tunnels) as an
introductory section before the technical diagram.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>