- Introduced README.md for plugin overview and installation instructions.
- Added hooks configuration in hooks.json for auto-save and pre-compact functionality.
- Implemented stop and pre-compact hooks in bash scripts for memory management.
- Created marketplace.json and plugin.json for plugin metadata and versioning.
- Developed skills and instructions for help, init, mine, search, and status functionalities.
- Added CLI commands for executing hooks and displaying skill instructions.
- Implemented hooks_cli.py for handling hook logic and JSON input/output.
- Enhanced instruction files for user guidance on setup and usage.
- Updated .gitignore to exclude additional files.
- Created GitHub Actions workflow for syncing plugin version on push.
ChromaDB 0.6.x bundles a Posthog telemetry client whose capture()
signature is incompatible with the installed posthog library, producing
noisy "Failed to send telemetry event" stderr warnings on every
operation. Silence them by raising the telemetry logger's level to CRITICAL.
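A minimal sketch of the silencing step. The logger name `chromadb.telemetry.product.posthog` is an assumption about where the warning originates; substitute whichever logger name appears in the stderr output.

```python
import logging

# Assumed logger name; the commit message does not state it.
TELEMETRY_LOGGER = "chromadb.telemetry.product.posthog"


def silence_telemetry(logger_name: str = TELEMETRY_LOGGER) -> logging.Logger:
    """Raise the logger threshold so WARNING and ERROR records are dropped."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.CRITICAL)
    return logger
```

Only records at CRITICAL level still pass, so a genuinely fatal telemetry failure would remain visible.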
ONNX Runtime's CoreML execution provider segfaults during vector
queries on macOS ARM64 (issue #74). Auto-set ORT_DISABLE_COREML=1
on Apple Silicon to force CPU execution, while respecting any
user-provided override via os.environ.setdefault().
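The override-respecting behaviour could look roughly like this. The function name and the injectable `environ`, `system`, and `machine` parameters are hypothetical, added only so the sketch can be exercised off a Mac.

```python
import os
import platform
from typing import MutableMapping, Optional


def disable_coreml_on_apple_silicon(
    environ: MutableMapping[str, str] = os.environ,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> None:
    """Default ORT_DISABLE_COREML to "1" on macOS ARM64 only."""
    system = system if system is not None else platform.system()
    machine = machine if machine is not None else platform.machine()
    if system == "Darwin" and machine == "arm64":
        # setdefault leaves any value the user already exported untouched.
        environ.setdefault("ORT_DISABLE_COREML", "1")
```

Because `setdefault` only writes when the key is absent, a user who exports `ORT_DISABLE_COREML=0` keeps that setting.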
Made-with: Cursor
The MCP server previously created a new PersistentClient on every tool
call via _get_collection(). This incurs HNSW index loading overhead
on each request.
Cache the client and collection at module level. The cache resets
naturally on process restart (MCP runs as a subprocess).
Also adds a _reset_mcp_cache fixture to conftest.py for test isolation.
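A sketch of the module-level caching pattern, under assumptions: the names `get_collection` and `reset_cache`, the injected `make_client` factory, and the collection name `"drawers"` are all hypothetical stand-ins for the real code.

```python
from typing import Any, Callable, Optional

# Module-level cache; lives for the lifetime of the server process.
_client_cache: Optional[Any] = None
_collection_cache: Optional[Any] = None


def get_collection(make_client: Callable[[], Any], name: str = "drawers") -> Any:
    """Build the client and collection once, then reuse them on later calls."""
    global _client_cache, _collection_cache
    if _collection_cache is None:
        _client_cache = make_client()
        _collection_cache = _client_cache.get_or_create_collection(name)
    return _collection_cache


def reset_cache() -> None:
    """Drop the cached objects; a test fixture could call this between tests."""
    global _client_cache, _collection_cache
    _client_cache = None
    _collection_cache = None
```

In real use, `make_client` might be something like `lambda: chromadb.PersistentClient(path=palace_path)`, so the index is loaded once per process rather than once per tool call.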
Includes test infrastructure from PR #131.
92 tests pass.
process_file() now returns (drawer_count, room) instead of just
drawer_count. The mine summary uses the returned room directly
instead of re-calling detect_room with empty content, which
produced wrong stats when routing relied on content keywords.
- Tighten chromadb dependency from >=0.4.0,<1 to >=0.5.0,<0.7
(the collection API changed significantly across majors; this
pins to the tested range)
- Add optional 'spellcheck' extras for the undeclared autocorrect
dependency used in spellcheck.py
- Add PEP 561 py.typed marker for type checker support
Findings: #10 (HIGH — chromadb range too wide), #30 (LOW — undeclared
autocorrect), #32 (LOW — missing py.typed)
Includes test infrastructure from PR #131.
92 tests pass.
- Enable WAL journal mode in _conn() for better concurrent read
performance and reduced SQLITE_BUSY risk
- Add LIMIT 100 to entity-filtered timeline query (was unbounded,
while global timeline already had LIMIT 100)
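For reference, enabling WAL on a connection might look like the sketch below. The helper name `connect_wal` is hypothetical; the real `_conn()` may differ.

```python
import sqlite3


def connect_wal(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(db_path)
    # In WAL mode readers are not blocked by a concurrent writer.
    # The setting is stored in the database file, so it persists across connections.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

WAL applies to file-backed databases; an in-memory database reports `memory` as its journal mode instead.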
Findings: #8 (HIGH — no WAL mode), #22 (LOW — inconsistent limits)
Includes test infrastructure from PR #131.
92 tests pass.
- Remove palace_path from _no_palace() error response (prevents
leaking filesystem paths to the LLM)
- Replace str(e) with generic 'Internal tool error' in MCP dispatch
catch block (full error is still logged server-side via stderr)
- Replace sys.exit(1) with return in searcher.search() CLI function
(prevents process termination if called from library context)
- Remove unused sys import from searcher.py
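The dispatch change can be illustrated with a small sketch. The `dispatch` function and the `{"result": ...}` / `{"error": ...}` response shape are hypothetical, not the server's actual protocol objects.

```python
import sys
import traceback
from typing import Any, Callable, Dict


def dispatch(handler: Callable[..., Any], arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool handler; never echo exception text back to the caller."""
    try:
        return {"result": handler(**arguments)}
    except Exception:
        # Full traceback goes to stderr for the operator.
        traceback.print_exc(file=sys.stderr)
        # The caller sees a fixed string, so paths and messages cannot leak.
        return {"error": "Internal tool error"}
```

A handler that raises `FileNotFoundError("/home/user/...")` therefore yields only the generic message in the response.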
Findings: #12 (HIGH), #5 (MEDIUM), #15 (LOW)
Includes test infrastructure from PR #131.
92 tests pass.
Prevents OOM when the palace grows large. The following unbounded
metadata fetches now have a safety cap:
- tool_status: col.get(include=['metadatas'], limit=10000)
- tool_list_wings: same
- tool_list_rooms: same (including wing-filtered variant)
- tool_get_taxonomy: same
- Layer1.generate: col.get(include=['documents','metadatas'], limit=10000)
Layer2 already had a limit parameter — no change needed.
Finding: #3 (CRITICAL — unbounded data fetching causes OOM)
Includes test infrastructure from PR #131.
92 tests pass.
- Replace len(text)//3 token heuristic with word-based estimate (~1.3 tokens/word)
- Old heuristic inflated compression ratios by ~3-5x
- Update docstrings: "compression" → "lossy summarization"
- Update module docstring to clarify AAAK is NOT lossless
- compression_stats() now returns honest field names and a note
- CLI output labels ratios as lossy
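A minimal sketch of the word-based estimate, assuming whitespace splitting and rounding to the nearest integer; the function name is hypothetical.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
    """Estimate token count as roughly 1.3 tokens per whitespace-separated word."""
    words = len(text.split())
    return round(words * tokens_per_word)
```

This is an approximation only; an exact count would need the tokenizer of the model in question.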
Fixes #43
detect_room() now matches folder path parts against room keywords,
not just the room name. Fixes docs/ files routing to general instead
of documentation room — "docs" wasn't a substring of "documentation"
but is now matched via the persisted keywords list.
Found during end-to-end testing after merging #108 keyword persistence.
Replace the ─ (U+2500) separator character with - in convo_miner.py.
Windows terminals using cp1252 encoding raise UnicodeEncodeError when
printing this character unless PYTHONUTF8=1 is set explicitly.
Fixes crash on Windows: UnicodeEncodeError: 'charmap' codec can't encode
character '\u2500'
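The failure can be reproduced without a Windows terminal by encoding the character directly. The helper below is a hypothetical illustration, not code from convo_miner.py.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True when every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True
```

`can_encode("\u2500", "cp1252")` is false while `can_encode("-", "cp1252")` is true, which is why the ASCII hyphen is a safe replacement.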
Pass yes flag through to detect_rooms_local so init --yes
skips both entity detection AND room approval prompts.
Agents and CI can now run init without interactive input.
Fixes #8
- Add `mempalace repair` command to rebuild vector index from SQLite
when HNSW files are corrupted after crash/interrupt (fixes #74, #72, #96)
- Fix split command passing dir as positional instead of --source
flag to split_mega_files (fixes #63)
- Handle Claude privacy export format (array of conversation objects
with chat_messages inside each) in normalize.py (fixes #63)
- Persist room keywords in mempalace.yaml so mine can match files
in docs/ to room "documentation" (fixes #108)
- hooks/mempal_save_hook.sh: pass $TRANSCRIPT_PATH as sys.argv
instead of interpolating into python -c string (fixes #110)
- normalize.py: accept type "user" in addition to "human" for
Claude Code JSONL sessions (fixes #111)
- convo_miner.py: skip tool-results/, memory/ dirs and .meta.json
files when scanning for conversations (fixes #111)
- pyproject.toml: pin chromadb>=0.4.0,<1 to avoid crashing 1.x
builds on macOS ARM64 (fixes #100)
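The save-hook fix above (passing the transcript path as an argument rather than interpolating it into a `python -c` string) is made in a bash script; the same principle is sketched here in Python for illustration. The function name and the inline script are hypothetical.

```python
import subprocess
import sys


def echo_path(path: str) -> str:
    """Hand an untrusted path to a child interpreter as data, not as code."""
    # The script text is constant; the path arrives as sys.argv[1].
    script = "import sys; print(repr(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", script, path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Because the path never becomes part of the program text, quotes or semicolons in a file name cannot change what the child process executes.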
ChromaDB requires native `int` for `n_results`, but the MCP JSON-RPC
transport can deliver JSON integers as floats or strings depending on
the client implementation. This causes `mempalace_search` (and any
tool with integer params like `max_hops`, `last_n`) to fail with:
"Expected requested number of results to be a int, got 3 in query."
Fix: auto-coerce tool arguments based on the declared `input_schema`
types before calling handlers. This covers all current and future
tools generically.
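A sketch of schema-driven coercion, assuming the schema is a JSON Schema object with a `properties` map. The function names are hypothetical, and only the `integer` type is handled here.

```python
from typing import Any, Dict


def _to_int(value: Any) -> Any:
    """Return an int for integral floats and numeric strings, else the input."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int; leave it alone
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            pass
    try:
        number = float(value)
    except (TypeError, ValueError):
        return value
    return int(number) if number.is_integer() else value


def coerce_arguments(arguments: Dict[str, Any], input_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce arguments whose declared schema type is "integer"."""
    properties = input_schema.get("properties", {})
    coerced = dict(arguments)
    for name, value in arguments.items():
        if properties.get(name, {}).get("type") == "integer":
            coerced[name] = _to_int(value)
    return coerced
```

Values that cannot be represented as an integer, such as `2.5`, pass through unchanged so the handler can reject them with its own error.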
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
col.get() without limit generates SELECT ... WHERE id IN (...) with all
document IDs, which exceeds SQLite's ~999 variable limit when a palace
has more than ~1000 drawers.
This breaks both `mempalace compress` and `mempalace wake-up` on large
palaces. Reproduced on a 13880-file codebase (242K+ drawers).
Fix: paginate reads in batches of 500 using ChromaDB's offset/limit
parameters in both Layer1.generate() and cmd_compress().
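The pagination loop might look like the sketch below. The generator name `iter_batches` is hypothetical; it assumes only that the collection's `get()` accepts `include`, `limit`, and `offset` and returns a dict with an `ids` list.

```python
from typing import Any, Dict, Iterator, List


def iter_batches(
    collection: Any,
    include: List[str],
    batch_size: int = 500,
) -> Iterator[Dict[str, Any]]:
    """Yield successive pages from a collection instead of one unbounded read."""
    offset = 0
    while True:
        batch = collection.get(include=include, limit=batch_size, offset=offset)
        ids = batch.get("ids") or []
        if not ids:
            break
        yield batch
        if len(ids) < batch_size:
            break  # short page means the end was reached
        offset += len(ids)
```

Each request then touches at most 500 IDs, which stays under the SQLite variable limit described above.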
Add _try_codex_jsonl parser for Codex CLI session files stored at
~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl.
Uses only event_msg entries (user_message / agent_message) which
represent the canonical conversation turns. response_item entries
are intentionally skipped — they include synthetic context injections
(environment_context) and can duplicate real messages when both
representations are present in the same rollout.
Format based on Codex source tests (codex-rs/rollout/src/recorder_tests.rs).
Requires session_meta header to reduce false positives on other JSONL.
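A rough sketch of the filtering described above. The field names `payload` and `message`, the role labels, and the choice to require `session_meta` as the first record are assumptions; the real `_try_codex_jsonl` may differ.

```python
import json
from typing import List, Optional, Tuple

# Assumed mapping from event payload type to conversation role.
ROLE_BY_EVENT = {"user_message": "user", "agent_message": "assistant"}


def parse_codex_jsonl(text: str) -> Optional[List[Tuple[str, str]]]:
    """Extract (role, message) turns from event_msg records only."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            records.append(record)

    # Without a session_meta header, treat the file as some other JSONL format.
    if not records or records[0].get("type") != "session_meta":
        return None

    turns = []
    for record in records:
        if record.get("type") != "event_msg":
            continue  # response_item and other record types are skipped
        payload = record.get("payload")
        if not isinstance(payload, dict):
            continue
        role = ROLE_BY_EVENT.get(payload.get("type"))
        message = payload.get("message")
        if role and isinstance(message, str) and message.strip():
            turns.append((role, message.strip()))
    return turns
```

Returning `None` rather than an empty list lets a caller fall through to the next format parser when the header is missing.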
Refs: #59
Replace broad except Exception with specific exception types in 6
sites where the expected failure mode is well-defined:
- normalize.py: OSError for file read, ImportError for optional import
- miner.py: OSError for file read_text
- entity_detector.py: OSError for file read in scan loop
- convo_miner.py: (OSError, ValueError) for normalize which reads
and parses files
- entity_registry.py: (URLError, OSError, JSONDecodeError, KeyError)
for Wikipedia lookup fallback
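As an illustration of the narrowing, a file read that catches only `OSError` might look like this. The helper name and the `None` fallback are hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_text_or_none(path: Path) -> Optional[str]:
    """Return file contents, or None when the file cannot be read."""
    try:
        return path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        # Missing file, permission denied, is-a-directory, and similar.
        return None
```

Programming errors such as a `TypeError` now propagate instead of being silently swallowed.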
ChromaDB except Exception sites (~30) are left broad for now.
chromadb.errors defines NotFoundError, DuplicateIDError,
InvalidDimensionException etc., but narrowing those sites requires
importing from chromadb.errors and validating across supported
versions (>=0.4.0). MCP server handlers are also left broad for
resilience.
Add usedforsecurity=False to hashlib.md5() calls in miner.py and
convo_miner.py to document that MD5 is used for deterministic ID
generation, not cryptographic security. Preserves stable drawer IDs
for backward compatibility with existing palaces.
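A sketch of deterministic ID generation with the flag set. The ID formula here (path plus chunk index, truncated digest, `drawer_` prefix) is hypothetical and is not the project's actual formula.

```python
import hashlib


def drawer_id(source_path: str, chunk_index: int) -> str:
    """Derive a stable, non-cryptographic ID from a path and chunk index."""
    key = f"{source_path}:{chunk_index}".encode("utf-8")
    # usedforsecurity=False marks this as a non-security use of MD5.
    digest = hashlib.md5(key, usedforsecurity=False).hexdigest()
    return f"drawer_{digest[:16]}"
```

The flag does not change the digest, so IDs computed before and after the change are identical.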
Swapping to SHA-256 would change the ID formula and make existing
drawers unreachable on re-ingestion. PR #34 covers the MD5 sites
in knowledge_graph.py and mcp_server.py.
Verified: usedforsecurity kwarg is supported since Python 3.9
(project target per pyproject.toml line 10), confirmed via Context7
CPython docs.
Remove discarded `query.lower()` call in `extract_people_from_query` —
strings are immutable so the result was always thrown away. The existing
`re.IGNORECASE` flag already handles case-insensitive matching.
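To illustrate why the lowercase call was redundant, the sketch below matches names case-insensitively using the flag alone. The function `find_names` is a hypothetical simplification, not `extract_people_from_query` itself.

```python
import re
from typing import List


def find_names(query: str, names: List[str]) -> List[str]:
    """Return the names that appear in the query as whole words, ignoring case."""
    found = []
    for name in names:
        pattern = rf"\b{re.escape(name)}\b"
        # IGNORECASE handles casing; the query itself is never modified.
        if re.search(pattern, query, re.IGNORECASE):
            found.append(name)
    return found
```

A bare `query.lower()` statement returns a new string and leaves `query` unchanged, so removing it has no effect on the matches.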
Remove duplicate literals in COMMON_ENGLISH_WORDS set: "hunter" (consecutive
duplicate), "april" and "june" (appeared in both names and months sections).