mempalace

Author	SHA1	Message	Date
Ben Sigman	725fa2b6f1	Merge branch 'main' into fix/query-sanitizer-prompt-contamination	2026-04-09 08:11:39 -07:00
matrix9neonebuchadnezzar2199-sketch	7509a72502	fix: mitigate system prompt contamination in search queries (#333) Addresses Issue #333: AI agents prepending system prompts to search queries causes embedding retrieval to collapse (89.8% → 1.0% R@10). Mitigation approach (damage reduction): - New query_sanitizer.py with 4-stage pipeline: Step 1: passthrough for short queries (≤200 chars) Step 2: question extraction (finds ? sentences) → ~85-89% recovery Step 3: tail sentence extraction → ~80-89% recovery Step 4: tail truncation fallback → ~70-80% recovery Worst case without sanitizer: 1.0% (catastrophic) Worst case with sanitizer: ~70-80% (survivable) - mcp_server.py: tool_search applies sanitizer before ChromaDB query - MCP schema: query description warns agents not to include prompts - New 'context' parameter separates background info from search intent - Sanitizer metadata included in response when triggered 22 new tests covering all pipeline stages and real-world scenarios. Made-with: Cursor	2026-04-09 23:28:59 +09:00
Tal Muskal	da64016a94	fix: format test_layers_bench.py with ruff to pass CI lint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 08:24:51 +03:00
Ben Sigman	d26606b2f9	Merge branch 'main' into main	2026-04-08 14:07:33 -07:00
Igor Lins e Silva	c4e52954fe	Merge upstream/main into bench/scale-test-suite to resolve conflicts Merged both the PR's benchmark suite additions (psutil dep, pytest markers, --ignore=tests/benchmarks) and upstream's coverage changes (pytest-cov, --cov-fail-under=30, coverage config) so both coexist. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2026-04-08 16:28:06 -03:00
Tal Muskal	28de031f25	fix: remove stale palace_path reference in test helper _patch_mcp_server had palace_path removed from its signature but the assertion body still referenced it, causing NameError at runtime and F821 from ruff. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 22:07:46 +03:00
Tal Muskal	dbf456b73b	Merge branch 'main' into main	2026-04-08 22:02:50 +03:00
Tal Muskal	abd52534bb	test: bring coverage to 85%, set threshold to 85, reset version to 3.0.11 - Add tests for config, convo_miner, spellcheck, knowledge_graph - Fix Windows PermissionError in test cleanup (chromadb file locks) - Add UTF-8 encoding to split_mega_files, entity_registry, hooks_cli - Fix mcp_server parse_known_args logging for unknown args - Set coverage threshold to 85 in pyproject.toml and CI - Reset all version files to 3.0.11 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 21:38:12 +03:00
Igor Lins e Silva	a0bcd0c836	fix: ruff format test_hooks_cli.py and test_knowledge_graph.py	2026-04-08 15:12:12 -03:00
Igor Lins e Silva	af42a850f6	fix: split semicolon statements onto two lines for ruff E702	2026-04-08 15:11:55 -03:00
Igor Lins e Silva	bf88daa649	fix: address review — re-mine modified files, idempotent add_drawer, cleanup ChromaDB handles	2026-04-08 15:11:55 -03:00
Igor Lins e Silva	a4149ab248	fix: use upsert and deterministic IDs to prevent data stagnation MCP tool_add_drawer: - Make drawer_id content-based: hash full content instead of content[:100] + timestamp. Same content → same ID, eliminating TOCTOU race conditions - Switch from col.add() to col.upsert() so re-filing with updated content updates the existing drawer miner.add_drawer: - Switch from collection.add() to collection.upsert() so re-mining a modified file updates instead of silently failing - Remove the try/except catching 'already exists' — upsert handles this naturally Findings: #11 (HIGH — add ignores updates), #6 (MEDIUM — TOCTOU), #13 (MEDIUM — non-deterministic IDs) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-08 15:11:55 -03:00
Tal Muskal	9ca70264f3	style: format test files with ruff Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 21:08:49 +03:00
Tal Muskal	e24d8ca733	test: expand coverage to 70%, fix mcp_server CI crash (threshold 60%) Add/expand tests for normalize (39%→97%), searcher (39%→100%), layers (28%→97%), split_mega_files (34%→72%). Fix mcp_server.py parse_args→parse_known_args to prevent SystemExit when imported during pytest (CI was crashing on all test jobs). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 21:07:03 +03:00
Tal Muskal	03e9b57108	test: add comprehensive test coverage (35% → 58%, threshold 50%) Add 180+ new tests across 10 test files covering previously untested modules: - instructions_cli (0% → 100%), hooks_cli (73% → 96%), spellcheck (28% → 84%) - palace_graph (9% → 91%), general_extractor (0% → 92%), entity_detector (0% → 69%) - entity_registry (0% → 70%), room_detector_local (0% → 55%), layers (0% → 28%) - onboarding (0% → 36%) Also fixes Windows encoding bug in onboarding.py (write_text without encoding="utf-8"). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 20:54:56 +03:00
Ben Sigman	59d011a23b	Merge pull request #270 from tmuskal/main Package MemPalace as standard Claude and Codex plugins with easy installation	2026-04-08 10:41:45 -07:00
Tal Muskal	9de302f881	feat: update README and CI configuration, add tests for hooks functionality	2026-04-08 20:40:03 +03:00
Igor Lins e Silva	ebc26f3960	fix: resolve formatting, regression logic, and pytest defaults - Run ruff format on all benchmark files (fixes CI lint job) - Fix check_regression() substring ambiguity: ordered keyword matching so "latency_improvement_pct" is correctly classified as higher-is-better - Update stale comments in conftest.py referencing wrong fixture - Add pytest addopts to skip benchmark/slow/stress markers by default	2026-04-08 10:56:39 -03:00
Igor Lins e Silva	7e4db33061	fix: resolve ruff lint errors in benchmark suite Remove unused imports (shutil, string, datetime, os, yaml, time, SCALE_CONFIGS) and unused variable assignments in timing-only calls.	2026-04-08 05:10:39 -03:00
Igor Lins e Silva	e8017ca2ec	bench: add per-room recall threshold test Concentrates all drawers into a single wing+room to isolate the embedding model's retrieval limit independent of palace filtering. Confirms recall degrades to ~0.4-0.5 at 5K drawers per room even with wing+room filters applied — the spatial structure helps by keeping buckets small, but can't fix the underlying embedding ceiling.	2026-04-08 05:06:31 -03:00
Igor Lins e Silva	7b89291334	bench: add scale benchmark suite (94 tests) Benchmark mempalace at configurable scale (1K–100K drawers) to find real-world performance limits. Tests cover MCP tool OOM thresholds, ChromaDB query degradation, search recall@k, mining throughput, knowledge graph concurrency, memory leak detection, palace boost quantification, and Layer1 unbounded fetch behavior. - tests/benchmarks/ with 8 test modules + data generator + report system - Deterministic data factory with planted needles for recall measurement - JSON report output with regression detection (--bench-report flag) - CI benchmark job on PRs at small scale - psutil added as dev dependency for RSS tracking	2026-04-08 05:06:31 -03:00
Igor Lins e Silva	47696bef8c	fix: address Copilot review — derive MCP version, improve test isolation and portability	2026-04-08 04:41:03 -03:00
Igor Lins e Silva	a67b00d7c7	perf: cache ChromaDB PersistentClient instead of re-instantiating per call The MCP server previously created a new PersistentClient on every tool call via _get_collection(). This incurs HNSW index loading overhead on each request. Cache the client and collection at module level. The cache resets naturally on process restart (MCP runs as a subprocess). Also adds a _reset_mcp_cache fixture to conftest.py for test isolation. Includes test infrastructure from PR #131. 92 tests pass.	2026-04-08 04:39:19 -03:00
Ben Sigman	a8de2911e5	Merge pull request #136 from igorls/fix/kg-hardening fix: enable SQLite WAL mode and add consistent LIMIT to KG timeline	2026-04-07 16:05:13 -07:00
Igor Lins e Silva	d3145e9a7b	fix: update dialect tests for PR #147 stats API and remove unused fixture param	2026-04-07 18:58:25 -03:00
Igor Lins e Silva	6fa985eac2	fix: update dialect tests for PR #147 stats API and remove unused fixture param	2026-04-07 18:58:20 -03:00
Igor Lins e Silva	b45bff9db1	test: add WAL mode and entity timeline limit assertions	2026-04-07 18:27:19 -03:00
Igor Lins e Silva	5ac4947d02	fix: preserve CLI exit codes, log tracebacks, sanitize search errors, validate fixture	2026-04-07 18:26:39 -03:00
Ben Sigman	27623a3b17	Merge pull request #131 from igorls/test/expand-coverage-and-uv-migration test: expand coverage from 20 to 92 tests, migrate to uv	2026-04-07 14:15:01 -07:00
Igor Lins e Silva	96de23cd97	fix: CI failures — update workflow for uv migration, fix lint and format - Switch CI install step from `pip install -r requirements.txt` to `pip install -e ".[dev]"` since requirements.txt was removed - Add noqa: E402 to intentionally-late imports in conftest.py (HOME must be isolated before mempalace imports) - Remove unused KnowledgeGraph import in test_knowledge_graph.py - Apply ruff formatting to test files	2026-04-07 17:59:21 -03:00
Ben Sigman	3068f75c2d	Merge pull request #22 from sheetsync/bugfix/split-known-names-loading refactor: consolidate split known-names config loading	2026-04-07 13:58:54 -07:00
Igor Lins e Silva	cd8b245fdc	fix: address Copilot review — remove unused imports, isolate HOME in tests, restore dev extra	2026-04-07 17:55:10 -03:00
Igor Lins e Silva	72c548b729	test: expand coverage from 20 to 92 tests, migrate to uv - Migrate from setuptools to hatchling build backend - Add dependency-groups (PEP 735) for dev tooling (pytest, ruff) - Remove redundant requirements.txt in favor of uv.lock - Fix __version__ mismatch (2.0.0 -> 3.0.0 to match pyproject.toml) New test files: - conftest.py: shared fixtures (isolated palace, KG, ChromaDB collection) - test_knowledge_graph.py: 17 tests (entity CRUD, temporal queries, timeline) - test_mcp_server.py: 25 tests (protocol dispatch, read/write/KG/diary tools) - test_searcher.py: 7 tests (search_memories API, filters, error handling) - test_dialect.py: 13 tests (AAAK compression, entity/emotion detection, zettel encoding) All 92 tests pass on Python 3.13 with chromadb 0.6.3.	2026-04-07 17:55:10 -03:00
Ben Sigman	e8f9b47e31	Merge pull request #16 from sheetsync/bugfix/version-consistency fix: unify package and MCP version reporting	2026-04-07 13:54:03 -07:00
ac-opensource	c8c220d789	fix: support nested .gitignore rules during mining	2026-04-08 00:02:21 +08:00
ac-opensource	9b9daa9b4b	fix: respect .gitignore during project mining	2026-04-07 22:26:06 +08:00
James Cane	0808ad96c2	refactor: consolidate split known-names config loading	2026-04-07 09:16:07 +01:00
James Cane	55152ce476	fix: unify package and MCP version reporting	2026-04-07 08:53:25 +01:00
bensig	0f8fa8c7d5	bench: add benchmark runners, results docs, and test suite Benchmarks: LongMemEval, LoCoMo, ConvoMem, MemBench runners with methodology docs and hybrid retrieval analysis. Tests: config, miner, convo_miner, normalize — 9 tests, all passing.	2026-04-04 18:33:42 -07:00
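
Note on commit 7509a72502: the four-stage query sanitizer described in that message can be sketched roughly as follows. This is an illustration only, not the contents of query_sanitizer.py: the function name `sanitize_query`, the returned `method` labels, and the sentence-splitting regex are assumptions; only the 200-character threshold and the order of the stages come from the commit message.

```python
import re

MAX_QUERY_CHARS = 200  # passthrough threshold stated in the commit message


def sanitize_query(raw: str, max_chars: int = MAX_QUERY_CHARS) -> dict:
    """Hypothetical sketch of a 4-stage sanitizer for contaminated queries."""
    text = raw.strip()

    # Step 1: short queries pass through untouched.
    if len(text) <= max_chars:
        return {"query": text, "method": "passthrough"}

    # Split into sentence-like segments (assumed heuristic, not the real one).
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    sentences = [s.strip() for s in parts if s.strip()]

    # Step 2: prefer the last short sentence that ends with a question mark.
    questions = [s for s in sentences if s.endswith("?") and len(s) <= max_chars]
    if questions:
        return {"query": questions[-1], "method": "question_extraction"}

    # Step 3: otherwise use the final sentence if it is short enough.
    if sentences and len(sentences[-1]) <= max_chars:
        return {"query": sentences[-1], "method": "tail_sentence"}

    # Step 4: fall back to the last max_chars characters of the input.
    return {"query": text[-max_chars:].strip(), "method": "tail_truncation"}


if __name__ == "__main__":
    contaminated = "You are a helpful assistant. " * 20 + "What did we decide about the migration?"
    print(sanitize_query(contaminated))
```

The design idea, as the commit describes it, is graceful degradation: each later stage recovers less of the original intent, but every stage avoids sending the full prompt to the embedding search.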

39 Commits
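
Note on commit a4149ab248: the switch to content-based drawer IDs plus upsert can be illustrated with a minimal sketch. The ID format, the helper names `make_drawer_id` and `add_drawer`, and the `InMemoryCollection` stand-in are hypothetical; the real code calls `upsert()` on a ChromaDB collection, which is replaced here by a dict so the example runs without dependencies.

```python
import hashlib


def make_drawer_id(wing: str, room: str, content: str) -> str:
    """Derive an ID from the full content, so the same content gives the same ID."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
    return f"drawer_{wing}_{room}_{digest}"  # assumed format, for illustration


class InMemoryCollection:
    """Dict-backed stand-in for a collection that supports upsert()."""

    def __init__(self) -> None:
        self.docs: dict[str, dict] = {}

    def upsert(self, ids, documents, metadatas) -> None:
        # Insert new IDs and overwrite existing ones; never raises on duplicates.
        for id_, doc, meta in zip(ids, documents, metadatas):
            self.docs[id_] = {"document": doc, "metadata": meta}


def add_drawer(col, wing: str, room: str, content: str) -> str:
    drawer_id = make_drawer_id(wing, room, content)
    col.upsert(
        ids=[drawer_id],
        documents=[content],
        metadatas=[{"wing": wing, "room": room}],
    )
    return drawer_id


if __name__ == "__main__":
    col = InMemoryCollection()
    first = add_drawer(col, "work", "decisions", "Use WAL mode for the KG database.")
    second = add_drawer(col, "work", "decisions", "Use WAL mode for the KG database.")
    print(first == second, len(col.docs))
```

Because the ID no longer depends on a timestamp, filing the same content twice is idempotent, and there is no separate existence check that could race with the write.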