mempalace

Author	SHA1	Message	Date
Tal Muskal	3d00a93655	feat: add MemPalace Claude Code plugin with hooks and instructions - Introduced README.md for plugin overview and installation instructions. - Added hooks configuration in hooks.json for auto-save and pre-compact functionality. - Implemented stop and pre-compact hooks in bash scripts for memory management. - Created marketplace.json and plugin.json for plugin metadata and versioning. - Developed skills and instructions for help, init, mine, search, and status functionalities. - Added CLI commands for executing hooks and displaying skill instructions. - Implemented hooks_cli.py for handling hook logic and JSON input/output. - Enhanced instruction files for user guidance on setup and usage. - Updated .gitignore to exclude additional files. - Created GitHub Actions workflow for syncing plugin version on push.	2026-04-08 14:55:46 +03:00
marerem	df33550945	fix: silence ChromaDB telemetry warnings and CoreML segfault on Apple Silicon ChromaDB 0.6.x bundles a Posthog telemetry client whose capture() signature is incompatible with the installed posthog library, producing noisy "Failed to send telemetry event" stderr warnings on every operation. Silence by raising the logger threshold to CRITICAL. ONNX Runtime's CoreML execution provider segfaults during vector queries on macOS ARM64 (issue #74). Auto-set ORT_DISABLE_COREML=1 on Apple Silicon to force CPU execution, while respecting any user-provided override via os.environ.setdefault(). Made-with: Cursor	2026-04-08 12:43:09 +02:00
Igor Lins e Silva	a67b00d7c7	perf: cache ChromaDB PersistentClient instead of re-instantiating per call The MCP server previously created a new PersistentClient on every tool call via _get_collection(). This incurs HNSW index loading overhead on each request. Cache the client and collection at module level. The cache resets naturally on process restart (MCP runs as a subprocess). Also adds a _reset_mcp_cache fixture to conftest.py for test isolation. Includes test infrastructure from PR #131. 92 tests pass.	2026-04-08 04:39:19 -03:00
adv3nt3	75eb7ff871	fix: use actual detected room in mine summary stats process_file() now returns (drawer_count, room) instead of just drawer_count. The mine summary uses the returned room directly instead of re-calling detect_room with empty content, which produced wrong stats when routing relied on content keywords.	2026-04-08 01:28:33 +02:00
Ben Sigman	71736a3f4f	Merge pull request #142 from igorls/chore/packaging-cleanup chore: tighten chromadb version range and add py.typed marker	2026-04-07 16:05:20 -07:00
Ben Sigman	c30bc9e71e	Merge pull request #137 from igorls/fix/bounded-queries fix: add limit=10000 safety cap to all unbounded ChromaDB .get() calls	2026-04-07 16:05:17 -07:00
Ben Sigman	a8de2911e5	Merge pull request #136 from igorls/fix/kg-hardening fix: enable SQLite WAL mode and add consistent LIMIT to KG timeline	2026-04-07 16:05:13 -07:00
Igor Lins e Silva	541e9bd1ee	chore: tighten chromadb version range and add py.typed marker - Tighten chromadb dependency from >=0.4.0,<1 to >=0.5.0,<0.7 (the collection API changed significantly across majors; this pins to the tested range) - Add optional 'spellcheck' extras for the undeclared autocorrect dependency used in spellcheck.py - Add PEP 561 py.typed marker for type checker support Findings: #10 (HIGH — chromadb range too wide), #30 (LOW — undeclared autocorrect), #32 (LOW — missing py.typed) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-07 18:51:42 -03:00
Igor Lins e Silva	5ac4947d02	fix: preserve CLI exit codes, log tracebacks, sanitize search errors, validate fixture	2026-04-07 18:26:39 -03:00
Igor Lins e Silva	21f2248a3c	fix: enable SQLite WAL mode and add consistent LIMIT to KG timeline - Enable WAL journal mode in _conn() for better concurrent read performance and reduced SQLITE_BUSY risk - Add LIMIT 100 to entity-filtered timeline query (was unbounded, while global timeline already had LIMIT 100) Findings: #8 (HIGH — no WAL mode), #22 (LOW — inconsistent limits) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-07 18:25:57 -03:00
Igor Lins e Silva	c9135aad67	fix: sanitize error responses and remove sys.exit from library code - Remove palace_path from _no_palace() error response (prevents leaking filesystem paths to the LLM) - Replace str(e) with generic 'Internal tool error' in MCP dispatch catch block (full error is still logged server-side via stderr) - Replace sys.exit(1) with return in searcher.search() CLI function (prevents process termination if called from library context) - Remove unused sys import from searcher.py Findings: #12 (HIGH), #5 (MEDIUM), #15 (LOW) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-07 18:24:10 -03:00
Igor Lins e Silva	161a0d12a2	fix: cap diary_read query and update stale comment	2026-04-07 18:23:30 -03:00
Igor Lins e Silva	9491ffa92b	fix: add limit=10000 safety cap to all unbounded ChromaDB .get() calls Prevents OOM when the palace grows large. The following unbounded metadata fetches now have a safety cap: - tool_status: col.get(include=['metadatas'], limit=10000) - tool_list_wings: same - tool_list_rooms: same (including wing-filtered variant) - tool_get_taxonomy: same - Layer1.generate: col.get(include=['documents','metadatas'], limit=10000) Layer2 already had a limit parameter — no change needed. Finding: #3 (CRITICAL — unbounded data fetching causes OOM) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-07 18:23:12 -03:00
bensig	39c14be113	fix: honest AAAK stats — word-based token estimator, lossy labels - Replace len(text)//3 token heuristic with word-based estimate (~1.3 tokens/word) - Old heuristic inflated compression ratios by ~3-5x - Update docstrings: "compression" → "lossy summarization" - Update module docstring to clarify AAAK is NOT lossless - compression_stats() now returns honest field names and a note - CLI output labels ratios as lossy Fixes #43	2026-04-07 14:14:31 -07:00
bensig	71fb66d687	fix: room detection checks keywords against folder paths detect_room() now matches folder path parts against room keywords, not just the room name. Fixes docs/ files routing to general instead of documentation room — "docs" wasn't a substring of "documentation" but is now matched via the persisted keywords list. Found during end-to-end testing after merging #108 keyword persistence.	2026-04-07 14:06:56 -07:00
Ben Sigman	3068f75c2d	Merge pull request #22 from sheetsync/bugfix/split-known-names-loading refactor: consolidate split known-names config loading	2026-04-07 13:58:54 -07:00
Ben Sigman	a59df81611	Merge pull request #66 from MARUCIE/fix/sqlite-batch-reads fix: batch ChromaDB reads to avoid SQLite variable limit	2026-04-07 13:58:52 -07:00
Ben Sigman	cea34366f5	Merge pull request #84 from AlexeySamosadov/fix/mcp-integer-coercion fix: coerce MCP integer arguments to native Python int	2026-04-07 13:58:49 -07:00
Ben Sigman	0b0e123f42	Merge pull request #61 from adv3nt3/feat/codex-cli-normalizer feat: add OpenAI Codex CLI JSONL normalizer	2026-04-07 13:54:08 -07:00
Ben Sigman	6af6fe3dda	Merge pull request #54 from adv3nt3/fix/narrow-exception-handling fix: narrow bare except Exception to specific types where safe	2026-04-07 13:54:05 -07:00
Ben Sigman	e8f9b47e31	Merge pull request #16 from sheetsync/bugfix/version-consistency fix: unify package and MCP version reporting	2026-04-07 13:54:03 -07:00
Ben Sigman	1c62045b22	Merge pull request #21 from sheetsync/bugfix/mcp-docs-alignment docs: align MCP setup examples with shipped server	2026-04-07 13:54:00 -07:00
Ben Sigman	f1f0a4c966	Merge pull request #129 from minimexat/fix/windows-unicode-encoding fix: replace Unicode separator character for Windows compatibility	2026-04-07 13:51:58 -07:00
Ben Sigman	30699b85a9	Merge pull request #42 from adv3nt3/fix/entity-registry-dead-code fix: remove dead code and duplicate set items in entity_registry.py	2026-04-07 13:51:55 -07:00
Ben Sigman	6aa4272b65	Merge pull request #53 from adv3nt3/fix/md5-usedforsecurity-miners fix: mark MD5 as non-security in miner drawer ID generation	2026-04-07 13:51:52 -07:00
Ben Sigman	3e6fc6ed9f	Merge pull request #83 from renatoliveira/main fix: update input prompt for entity confirmation in entity_detector.py	2026-04-07 13:51:50 -07:00
f-hoedl	d214f6a854	fix: replace Unicode separator in convo_miner.py for Windows compatibility Replace the ─ (U+2500) separator character with - in convo_miner.py. Windows terminals using cp1252 encoding raise UnicodeEncodeError when printing this character unless PYTHONUTF8=1 is set explicitly. Fixes crash on Windows: UnicodeEncodeError: 'charmap' codec can't encode character '\u2500'	2026-04-07 21:55:34 +02:00
bensig	caa1169f04	fix: --yes flag now skips room confirmation in init Pass yes flag through to detect_rooms_local so init --yes skips both entity detection AND room approval prompts. Agents and CI can now run init without interactive input. Fixes #8	2026-04-07 12:16:46 -07:00
Ben Sigman	01a21dd60f	Merge pull request #78 from ac-opensource/feature/respect-gitignore-mining Respect nested .gitignore rules when mining project files	2026-04-07 12:15:23 -07:00
bensig	5e8a039e7c	fix: repair command, split args, Claude export, room keywords - Add `mempalace repair` command to rebuild vector index from SQLite when HNSW files are corrupted after crash/interrupt (fixes #74, #72, #96) - Fix split command passing dir as positional instead of --source flag to split_mega_files (fixes #63) - Handle Claude privacy export format (array of conversation objects with chat_messages inside each) in normalize.py (fixes #63) - Persist room keywords in mempalace.yaml so mine can match files in docs/ to room "documentation" (fixes #108)	2026-04-07 12:02:34 -07:00
bensig	186bb2e3d1	fix: shell injection in hooks, Claude Code mining, chromadb pin - hooks/mempal_save_hook.sh: pass $TRANSCRIPT_PATH as sys.argv instead of interpolating into python -c string (fixes #110) - normalize.py: accept type "user" in addition to "human" for Claude Code JSONL sessions (fixes #111) - convo_miner.py: skip tool-results/, memory/ dirs and .meta.json files when scanning for conversations (fixes #111) - pyproject.toml: pin chromadb>=0.4.0,<1 to avoid crashing 1.x builds on macOS ARM64 (fixes #100)	2026-04-07 11:45:51 -07:00
ac-opensource	c8c220d789	fix: support nested .gitignore rules during mining	2026-04-08 00:02:21 +08:00
Alexey Samosadov	8fbb6178dd	fix: coerce MCP integer arguments to native Python int ChromaDB requires native `int` for `n_results`, but the MCP JSON-RPC transport can deliver JSON integers as floats or strings depending on the client implementation. This causes `mempalace_search` (and any tool with integer params like `max_hops`, `last_n`) to fail with: "Expected requested number of results to be a int, got 3 in query." Fix: auto-coerce tool arguments based on the declared `input_schema` types before calling handlers. This covers all current and future tools generically. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-07 17:48:03 +03:00
Renato Oliveira	cfe878204e	fix: update input prompt for entity confirmation in entity_detector.py Refine the prompt for distinguishing between person and project entities by adjusting the wording for clarity.	2026-04-07 11:41:15 -03:00
ac-opensource	9b9daa9b4b	fix: respect .gitignore during project mining	2026-04-07 22:26:06 +08:00
Maurice Wen	0e77981dec	fix: batch ChromaDB reads to avoid SQLite variable limit col.get() without limit generates SELECT ... WHERE id IN (...) with all document IDs, which exceeds SQLite's ~999 variable limit when a palace has more than ~1000 drawers. This breaks both `mempalace compress` and `mempalace wake-up` on large palaces. Reproduced on a 13880-file codebase (242K+ drawers). Fix: paginate reads in batches of 500 using ChromaDB's offset/limit parameters in both Layer1.generate() and cmd_compress().	2026-04-07 21:40:12 +08:00
adv3nt3	d4e1945f77	feat: add OpenAI Codex CLI JSONL normalizer Add _try_codex_jsonl parser for Codex CLI session files stored at ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl. Uses only event_msg entries (user_message / agent_message) which represent the canonical conversation turns. response_item entries are intentionally skipped — they include synthetic context injections (environment_context) and can duplicate real messages when both representations are present in the same rollout. Format based on Codex source tests (codex-rs/rollout/src/recorder_tests.rs). Requires session_meta header to reduce false positives on other JSONL. Refs: #59	2026-04-07 14:50:04 +02:00
adv3nt3	312d380aab	fix: narrow bare except Exception to specific types where safe Replace broad except Exception with specific exception types in 6 sites where the expected failure mode is well-defined: - normalize.py: OSError for file read, ImportError for optional import - miner.py: OSError for file read_text - entity_detector.py: OSError for file read in scan loop - convo_miner.py: (OSError, ValueError) for normalize which reads and parses files - entity_registry.py: (URLError, OSError, JSONDecodeError, KeyError) for Wikipedia lookup fallback ChromaDB except Exception sites (~30) are left broad for now. chromadb.errors defines NotFoundError, DuplicateIDError, InvalidDimensionException etc., but narrowing those sites requires importing from chromadb.errors and validating across supported versions (>=0.4.0). MCP server handlers also left broad for resilience.	2026-04-07 13:51:27 +02:00
adv3nt3	3a2817505a	fix: mark MD5 as non-security in miner drawer ID generation Add usedforsecurity=False to hashlib.md5() calls in miner.py and convo_miner.py to document that MD5 is used for deterministic ID generation, not cryptographic security. Preserves stable drawer IDs for backward compatibility with existing palaces. Swapping to SHA-256 would change the ID formula and make existing drawers unreachable on re-ingestion. PR #34 covers the MD5 sites in knowledge_graph.py and mcp_server.py. Verified: usedforsecurity kwarg is supported since Python 3.9 (project target per pyproject.toml line 10), confirmed via Context7 CPython docs.	2026-04-07 13:41:00 +02:00
adv3nt3	3c78e2fbb5	fix: remove dead code and duplicate set items in entity_registry.py Remove discarded `query.lower()` call in `extract_people_from_query` — strings are immutable so the result was always thrown away. The existing `re.IGNORECASE` flag already handles case-insensitive matching. Remove duplicate literals in COMMON_ENGLISH_WORDS set: "hunter" (consecutive duplicate), "april" and "june" (appeared in both names and months sections).	2026-04-07 13:00:59 +02:00
James Cane	0808ad96c2	refactor: consolidate split known-names config loading	2026-04-07 09:16:07 +01:00
James Cane	1557eaa2f5	docs: align MCP setup examples with shipped server	2026-04-07 09:15:16 +01:00
James Cane	55152ce476	fix: unify package and MCP version reporting	2026-04-07 08:53:25 +01:00
bensig	6d8c462219	fix: resolve ruff lint and format errors across codebase Fix E402 import ordering, F841 unused variable, F541 unnecessary f-strings, F401 unused import, and auto-format 6 files.	2026-04-04 18:37:17 -07:00
Milla Jovovich	068dbd9a7b	MemPalace: palace architecture, AAAK compression, knowledge graph The memory system: - Palace structure: Wings (people/projects) → Rooms (topics) → Closets (AAAK compressed) → Drawers (verbatim transcripts) - Halls connect related rooms within a wing - Tunnels cross-reference rooms across wings - AAAK: 30x lossless compression dialect for AI agents - Knowledge graph: temporal entity-relationship triples (SQLite) - Palace graph: room-based navigation with tunnel detection - MCP server: 19 tools — search, graph traversal, agent diary, AAAK auto-teach - Onboarding: guided setup generates wing config + AAAK entity registry - Contradiction detection: catches wrong pronouns, names, ages - Auto-save hooks for Claude Code 96.6% Recall@5 on LongMemEval — highest zero-API score published. 100% with optional Haiku rerank (500/500). Local. Free. No API key required.	2026-04-04 18:16:04 -07:00
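
Note on 8fbb6178dd (coerce MCP integer arguments): the commit message describes coercing tool arguments from the declared `input_schema` types before the handler is called. The sketch below illustrates that idea only; it is not the code from the commit. The function name `coerce_arguments` and the JSON-Schema-style layout of the schema (`properties` mapping each parameter name to a `type`) are assumptions made for illustration.

```python
def coerce_arguments(arguments, input_schema):
    """Return a copy of `arguments` with integer-typed params as native int.

    Hypothetical helper: `input_schema` is assumed to look like
    {"properties": {"limit": {"type": "integer"}, ...}}.
    """
    properties = input_schema.get("properties", {})
    coerced = dict(arguments)
    for name, value in arguments.items():
        declared = properties.get(name, {}).get("type")
        if declared != "integer" or isinstance(value, bool):
            continue  # only touch params declared as integers; leave bools alone
        if isinstance(value, float) and value.is_integer():
            coerced[name] = int(value)  # e.g. 3.0 -> 3
        elif isinstance(value, str):
            try:
                coerced[name] = int(value.strip())  # e.g. "2" -> 2
            except ValueError:
                pass  # not a clean integer string; let the handler reject it
    return coerced


if __name__ == "__main__":
    schema = {
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer"},
            "max_hops": {"type": "integer"},
        }
    }
    print(coerce_arguments({"query": "auth", "limit": 3.0, "max_hops": "2"}, schema))
    # prints {'query': 'auth', 'limit': 3, 'max_hops': 2}
```

Driving the coercion from the schema rather than from a per-tool list is what lets one code path cover every parameter declared as an integer, which matches the commit's stated goal of covering current and future tools generically.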

95 Commits
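
Note on 0e77981dec (batch ChromaDB reads): the commit message describes paginating reads in batches of 500 with `offset`/`limit` instead of fetching every drawer in one call. A minimal sketch of that paging loop follows. It is an illustration under stated assumptions, not the code from the commit: `read_all_paginated` is a hypothetical name, and `FakeCollection` is a stand-in that only mimics the `limit`/`offset` arguments of a collection's `get()` so the example runs without a database.

```python
class FakeCollection:
    """Stand-in for a collection whose get() supports limit/offset paging."""

    def __init__(self, n):
        self._ids = [f"drawer_{i}" for i in range(n)]

    def get(self, limit, offset=0, include=None):
        page = self._ids[offset:offset + limit]
        return {"ids": page, "metadatas": [{"room": "general"} for _ in page]}


def read_all_paginated(col, batch_size=500):
    """Collect ids and metadatas page by page instead of in one large read."""
    ids, metadatas = [], []
    offset = 0
    while True:
        batch = col.get(limit=batch_size, offset=offset, include=["metadatas"])
        batch_ids = batch["ids"]
        if not batch_ids:
            break  # an empty page means everything has been read
        ids.extend(batch_ids)
        metadatas.extend(batch.get("metadatas") or [])
        offset += len(batch_ids)
    return ids, metadatas


if __name__ == "__main__":
    ids, metas = read_all_paginated(FakeCollection(1234), batch_size=500)
    print(len(ids), len(metas))  # prints 1234 1234
```

Each page stays well below the roughly 999-variable SQLite limit mentioned in the commit message, at the cost of one round trip per 500 drawers.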