mempalace

Author	SHA1	Message	Date
copilot-swe-agent[bot]	24bf97bb65	fix(tests): avoid ONNX network download in update-length validation tests test_base_collection_update_default_validates_list_lengths and test_base_collection_update_default_rejects_mismatched_lengths were spinning up a real ChromaBackend and calling add(documents=...), which triggered ChromaDB's default ONNX embedding function and attempted a network download — failing in offline/sandboxed CI. BaseCollection.update() validates list lengths before any DB access, so no items need to be pre-loaded for the length-check to fire. Switch both tests to use _FakeCollection (same as the rest of the unit tests in this file) so they are pure in-memory and network-free. Also fixes a structural bug in test 1: collection._collection.add() was accidentally placed inside the pytest.raises(ValueError) block, masking the real assertion. Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/55fc663e-b256-4b8b-88ce-4271560def8d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-18 16:23:58 +00:00
Igor Lins e Silva	42b940d263	fix(backends): address Copilot review on #995 Four defects surfaced by the automated review, fixed with targeted tests: 1. BaseCollection.update() default now validates that documents / metadatas / embeddings lengths match ids, raising ValueError instead of silently misaligning pairs or raising IndexError (base.py). 2. ChromaCollection.query() now rejects the two ambiguous input shapes up front — neither or both of query_texts / query_embeddings, and empty input lists — with clear ValueError messages rather than delegating to chromadb's less-obvious errors (chroma.py). 3. QueryResult.empty() accepts embeddings_requested=True to preserve the outer-query dimension with empty hit lists when the caller asked for embeddings, matching the spec rule that included fields carry the outer shape even when empty (base.py). ChromaCollection.query() threads this through on the empty-result path (chroma.py). 4. ChromaBackend cache-freshness check now matches the semantics from mcp_server._get_client (merged via #757) on three edge cases Copilot called out: (a) invalidate when chroma.sqlite3 disappears while a cached client is held, (b) treat a 0→nonzero stat transition as a change so a cache built when the DB did not yet exist is refreshed, (c) re-stat after PersistentClient constructs the DB lazily so freshness reflects the post-creation state (chroma.py). Tests: 978 passed (up from 970), 8 new tests covering the fixes.	2026-04-18 13:19:18 -03:00
Igor Lins e Silva	a17a8b734a	refactor(backends): typed QueryResult/GetResult, PalaceRef, BaseBackend registry (RFC 001 §10) Advances RFC 001 §10 cleanup so backend-author PRs (#574 LanceDB, #665 Postgres, #700 Qdrant, #697 hosted, #643 PalaceStore, #381 Qdrant) have a stable target to align against. Scope (this PR): - Typed QueryResult / GetResult dataclasses replace Chroma's dict shape at the BaseCollection boundary (§1.3). A transitional _DictCompatMixin keeps existing callers working while the attribute-access migration proceeds. - BaseCollection is now kwargs-only across add/upsert/query/get/delete/update with ABC defaults for estimated_count/close/health and a non-atomic default update() (§1.1–1.2). - PalaceRef replaces raw path strings at the backend boundary (§2.2). - BaseBackend ABC with get_collection/close_palace/close/health/detect (§2.3). - mempalace.backends entry-point group + in-tree registry with resolve_backend_for_palace priority order matching §3.2–3.3. - ChromaCollection normalizes chroma returns into typed results; unknown where-clause operators raise UnsupportedFilterError (no silent drop, §1.4). - ChromaBackend absorbs the inode/mtime client-cache freshness check previously duplicated in mcp_server._get_client() (§10 + PR #757). - searcher.py migrated to typed-attribute access as the reference call site; remaining callers land in a follow-up. - pyproject: chroma registered via [project.entry-points."mempalace.backends"]. Out of scope (explicit follow-ups): - Full caller migration off the dict-compat shim across palace.py, mcp_server.py, miner.py, convo_miner.py, dedup.py, repair.py, exporter.py, palace_graph.py, cli.py, closet_llm.py. - Embedder injection + three-state EmbedderIdentityMismatchError check (§1.5). - maintenance_state() / run_maintenance() benchmark hooks (§7.3). - AbstractBackendContractSuite full coverage (§7.1–7.2). - mempalace migrate / mempalace verify CLI rewrites through BaseCollection (§8). 
Tests: 970 passed (up from 967 on develop); new coverage for typed results, empty-result outer-shape preservation, $regex rejection, registry lookup, priority resolver, and PalaceRef-kwarg ChromaBackend.get_collection. Refs: #743 (RFC 001), #989 (RFC 002 tracking issue).	2026-04-18 12:45:16 -03:00
Igor Lins e Silva	e4a2cd48a2	Merge pull request #984 from domiscd/feat/landing-page-update feat/landing-page: Improve landing page readability	2026-04-17 19:47:39 -03:00
Dominique Deschatre	2e3e0b979c	Update landing.css	2026-04-17 19:40:25 -03:00
Dominique Deschatre	9e8281aab5	(landing) svg icons animations	2026-04-17 19:37:30 -03:00
Dominique Deschatre	e5f5009f80	(landing) added Closets section	2026-04-17 19:18:10 -03:00
Dominique Deschatre	89f0eb5cb3	refactor(website): split Landing.vue into section components Extract 2002-line monolith into landing/ subfolder: - 8 section components (FolioHeader, HeroSection, ForgettingSection, AnatomySection, DialectSection, MechanicsSection, InstallSection, CatalogFooter) - useLandingEffects.js composable for all vanilla-JS effects - landing.css for all styles - Landing.vue reduced to 28-line orchestrator Also restores upstream hero lede text ("permanent. Designed for total recall.").	2026-04-17 18:49:41 -03:00
Dominique Deschatre	8c3d1ba86c	Merge remote-tracking branch 'upstream/develop' into feat/landing-page-update Co-authored-by: Copilot <copilot@github.com>	2026-04-17 17:00:47 -03:00
Dominique Deschatre	28d4f67ba2	landing hero container	2026-04-17 15:53:50 -03:00
Igor Lins e Silva	41bff266a4	Merge pull request #918 from almirus/develop feat(cli): add version display and version flag to CLI	2026-04-17 00:29:55 -03:00
Igor Lins e Silva	596f3d3a8e	Merge pull request #964 from MemPalace/fix/website-false-claims fix(website): correct false claims and stale numbers in live docs	2026-04-16 23:38:08 -03:00
Igor Lins e Silva	0cb9ee5c58	fix(website): correct false claims and stale numbers in live docs - Landing: replace nonexistent `mempalace remember` CLI demo with real `mempalace mine ./notes` - Landing: soften unverifiable absolutes ("forever available", "100% recall by design", "<50 ms", "90%+ compression", "two-thousand-year-old", "tens of thousands of entries") - MCP tool count: 19 → 29 across mcp-integration, claude-code, openclaw, and modules; expand tool overview with Drawers, Tunnels, and System categories to match mcp_server.py - Wake-up token range: ~170–900 → ~600–900 in cli/api-reference/python-api to match cli.py help text and concept docs - Gemini CLI: move `--scope user` before target name and add `--` separator so `-m mempalace.mcp_server` isn't parsed as Gemini flags	2026-04-16 23:31:35 -03:00
Igor Lins e Silva	51919fef0c	Merge pull request #963 from domiscd/feat/landing-page-update feat(website): update landing page	2026-04-16 22:37:16 -03:00
Dominique Deschatre	c8727b3a2d	chore(website): add Google Analytics	2026-04-16 22:34:37 -03:00
Dominique Deschatre	44c525ddd3	Merge remote-tracking branch 'upstream/develop' into feat/landing-page-update # Conflicts: # website/index.md	2026-04-16 22:31:22 -03:00
Dominique Deschatre	d8ac4c3abb	new landing page pt 2	2026-04-16 22:24:15 -03:00
Dominique Deschatre	9893fa2383	new landing page	2026-04-16 21:46:03 -03:00
Igor Lins e Silva	55a004fe1e	Merge pull request #931 from mvalentsev/fix/i18n-entity-metadata fix: use i18n candidate patterns for entity extraction in miner and palace	2026-04-16 15:54:01 -03:00
Igor Lins e Silva	c5e249bba8	Merge pull request #946 from mvalentsev/fix/utf8-read-text fix: add explicit UTF-8 encoding to read_text() calls (#776)	2026-04-16 15:52:42 -03:00
Igor Lins e Silva	65f99ad7e6	Merge pull request #928 from arnoldwender/fix/i18n-lang-case-insensitive fix(i18n): resolve language codes case-insensitively (#927)	2026-04-16 15:44:36 -03:00
Igor Lins e Silva	29112fab82	Merge pull request #778 from dominosaurs/feat/id-lang feat: add Indonesian language support	2026-04-16 15:44:26 -03:00
Igor Lins e Silva	4215be3926	Merge pull request #773 from tejasashinde/feat/add-i18n-hindi feat: add Hindi language support to i18n module	2026-04-16 15:44:08 -03:00
mvalentsev	09fe2dda3c	fix: add explicit UTF-8 encoding to read_text() calls (#776) On Windows with non-UTF-8 locale (e.g. GBK), Path.read_text() defaults to platform encoding, breaking onboarding tests and any source code that reads JSON/markdown with non-ASCII content. 5 files, 8 call sites fixed.	2026-04-16 16:00:29 +05:00
🍕	939d4c1e74	feat: Update Indonesian translations Refine AAAK instruction and expand entity detection patterns.	2026-04-16 17:43:51 +08:00
🍕	88f5b5fa0e	Add Indonesian language support Introduces the Indonesian (id) locale, providing translations for CLI commands, status messages, and core terminology. Includes language-specific regex patterns for stop words and action detection to support text processing and indexing in Indonesian. The test suite is updated with a sample case to verify correct dialect handling and compression.	2026-04-16 16:15:47 +08:00
mvalentsev	cde0f5b9e7	remove unnecessary comment	2026-04-16 10:38:38 +05:00
mvalentsev	973bd62a9a	fix: use pre-wrapped candidate patterns after #932 refactor	2026-04-16 10:37:18 +05:00
mvalentsev	8bf940f861	fix: use i18n candidate patterns for entity extraction in miner and palace entity_detector.py was refactored in #911 to load candidate patterns from i18n locale JSON files, supporting non-Latin scripts (Cyrillic, accented Latin, etc.). But three other code paths still hardcoded the ASCII-only regex [A-Z][a-z]{2,}, silently missing non-Latin entity names in metadata tagging, closet indexing, and registry lookups. Replace the hardcoded regex with a shared _candidate_entity_words() helper that reuses the same i18n candidate_patterns as entity_detector.	2026-04-16 10:35:40 +05:00
tejasashinde	21da870bd0	fix(i18n/hi): add boundary_chars and update action_pattern for Devanagari-aware matching	2026-04-16 09:21:21 +05:30
Igor Lins e Silva	d4c942417a	Merge pull request #932 from MemPalace/fix/entity-detector-non-latin-boundaries fix(entity_detector): script-aware word boundaries for combining-mark scripts	2026-04-15 22:38:59 -03:00
Igor Lins e Silva	f895bc58e6	fix(entity_detector): script-aware word boundaries for combining-mark scripts Python's \b is a \w/non-\w transition. Devanagari vowel signs (matras) like ा ी ु are Unicode category Mc (Mark, Spacing Combining) — not \w. This means \b splits mid-word on every matra: names like अनीता (Anita) truncate to अनीत, and person-verb patterns like \bराज\s+ने\s+कहा\b never match because \b fails after the final matra of कहा. Same issue affects Arabic, Hebrew, Thai, Tamil, and every other script whose words contain combining marks. Fix: locales with combining-mark scripts declare a boundary_chars field in their entity section (e.g. "\\w\\u0900-\\u097F" for Hindi). The i18n loader replaces every \b in that locale's patterns with a script-aware lookaround that treats the declared characters as "inside-word", and pre-wraps candidate/multi_word patterns with the same boundary. Default behavior (no boundary_chars) keeps standard \b — en, pt-br, ru, it are unchanged. Changes: - mempalace/i18n/__init__.py: add _script_boundary, _expand_b, _wrap_candidate, _collect_entity_section; candidate_patterns are now returned fully-wrapped (boundary + capture group applied) - mempalace/entity_detector.py: extract_candidates compiles pre-wrapped candidate patterns directly instead of re-wrapping with \b - tests/test_entity_detector.py: 5 new tests for Devanagari boundaries (name extraction with/without boundary_chars, person-verb firing, English regression)	2026-04-15 22:18:52 -03:00
Arnold Wender	6caac50138	fix(i18n): use Optional[str] for Python 3.9 compatibility PEP 604 union syntax (str | None) requires Python 3.10+. The project supports 3.9 per CI matrix, so use typing.Optional instead.	2026-04-15 23:37:12 +02:00
Arnold Wender	0174b93d0f	fix(i18n): resolve language codes case-insensitively (#927) BCP 47 language tags are case-insensitive (RFC 5646 §2.1.1) but the locale files mix conventions (pt-br.json vs zh-CN.json). On case-sensitive filesystems, '--lang PT-BR' or '--lang zh-cn' silently missed the file, _load_entity_section returned {}, and entity detection ran in English with no warning. The cache key in get_entity_patterns was built from raw input, so ('PT-BR',) and ('pt-br',) produced two distinct entries, both wrong. Add _canonical_lang(lang) that resolves any casing to the on-disk filename stem via lowercase comparison, and route load_lang, _load_entity_section, and the cache key through it. Closes #927	2026-04-15 23:33:42 +02:00
Igor Lins e Silva	122ce38811	Merge pull request #907 from Archetipo95/feat/italian-i18n-support feat: add Italian language support	2026-04-15 18:05:13 -03:00
Igor Lins e Silva	57b0b14192	Merge pull request #156 from mvalentsev/feat/pt-br-entity-detection feat: add Brazilian Portuguese support to entity_detector (closes #117)	2026-04-15 17:53:30 -03:00
almirus	10cdd93cec	feat(cli): add version display and version flag to CLI Introduces a version label to the command-line interface, displaying the current MemPalace version in the help text. Adds a `--version` flag to allow users to easily check the version and exit.	2026-04-15 21:44:20 +03:00
mvalentsev	4221589df2	fix(i18n): address review feedback on pt-br.json - dialogue_patterns[0]: remove stray \" before > (fixes markdown quote matching) - entity stopwords: add 40 prepositions, conjunctions, and common words to reduce false positives - pronoun_patterns: add 2nd-person (você/vocês) and possessives (seu/sua/seus/suas)	2026-04-15 23:32:31 +05:00
mvalentsev	3d13a72ae0	feat(i18n): add Brazilian Portuguese locale with entity detection (closes #117) CLI strings, AAAK instruction, regex patterns, and entity section with person-verb, pronoun, dialogue, and candidate patterns for Latin+diacritics names (Joao, Ines, Angela). Follows the i18n entity framework from #911.	2026-04-15 23:32:31 +05:00
Tejas Shinde	33a98fb9d1	Updated hi.json to support infra for entity,pronoun_patterns,dialogue_patterns,direct_address_pattern, project_verb_patterns and stopwords	2026-04-15 23:33:24 +05:30
Tejas Shinde	ce3ae0a668	Merge branch 'MemPalace:develop' into feat/add-i18n-hindi	2026-04-15 23:19:57 +05:30
Martin Masevski	69453b2180	feat: add italian entity patterns	2026-04-15 19:18:23 +02:00
Martin Masevski	2e998db0b9	feat: add italian i18n support	2026-04-15 19:15:55 +02:00
Igor Lins e Silva	73a2f82d5b	Merge pull request #760 from mvalentsev/feat/i18n-russian feat: add Russian language support (ru.json)	2026-04-15 13:46:04 -03:00
Igor Lins e Silva	312b3b5f0e	Merge pull request #758 from mvalentsev/fix/i18n-review-issues fix: address i18n review issues from PR #718	2026-04-15 13:45:49 -03:00
mvalentsev	4b998de77a	feat(i18n): expand Russian entity stopwords with prepositions and conjunctions Adds 34 prepositions and conjunctions to reduce false positives in entity detection when these words appear sentence-initial. Co-Authored-By: almirus <almirus@users.noreply.github.com>	2026-04-15 21:14:51 +05:00
mvalentsev	3e49522a42	fix(i18n): apply review feedback on ru.json (#760) - mine_skip: "повторной раскопки" -> "повторной обработки" - quote_pattern: add Russian guillemet quotes «» Co-Authored-By: almirus <almirus@users.noreply.github.com>	2026-04-15 20:17:16 +05:00
mvalentsev	d6bd7de5f6	feat(i18n): add entity detection section to Russian locale Cyrillic candidate/multi-word patterns, person-verb patterns (сказал, спросил, ответил, etc.), pronoun patterns, dialogue markers, direct address, and Russian stopwords. Follows the i18n entity framework from #911.	2026-04-15 18:16:25 +05:00
mvalentsev	b87ada3c96	feat: add Russian language support to i18n module Add ru.json with full Russian translations for CLI strings, palace terminology, AAAK compression instruction, and regex patterns for topic/action extraction with Cyrillic character classes. No code changes needed -- the i18n module auto-discovers language files via *.json glob in the i18n directory.	2026-04-15 18:15:15 +05:00
Igor Lins e Silva	3bac3654c4	Merge pull request #911 from MemPalace/refactor/entity-detector-i18n refactor(entity_detector): make multi-language extensible via i18n JSON	2026-04-15 09:40:36 -03:00
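
The word-boundary problem described in commit f895bc58e6 can be reproduced in a few lines. This is a minimal sketch, not the project's code: the two patterns and the BOUNDARY_CHARS value are illustrative assumptions modeled on the commit message, and the actual _script_boundary / _expand_b helpers may be implemented differently.

```python
import re

# Illustrative value, modeled on the "boundary_chars" example in the commit message.
BOUNDARY_CHARS = r"\w\u0900-\u097F"

name = "\u0905\u0928\u0940\u0924\u093e"  # the Devanagari name Anita from the commit message

# Plain \b: the trailing vowel sign U+093E is a combining mark, not \w,
# so the only boundary the regex can find is one character before the end.
naive = re.compile(r"\b[\u0900-\u097F]+\b")

# Script-aware boundary: lookarounds treat every declared character as inside a word.
aware = re.compile(
    rf"(?<![{BOUNDARY_CHARS}])[\u0900-\u097F]+(?![{BOUNDARY_CHARS}])"
)

naive_match = naive.search(name).group(0)
aware_match = aware.search(name).group(0)
print(len(naive_match), len(aware_match))
```

The naive pattern drops the final vowel sign, while the lookaround version returns the whole name. Locales that do not declare boundary characters would keep the plain \b form, as the commit message states.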

443 Commits
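
The case-insensitive language resolution described in commit 0174b93d0f can be sketched as follows. The name canonical_lang mirrors the _canonical_lang helper named in that message, but the body, the available parameter, and the choice to return unknown codes unchanged are assumptions made for illustration only.

```python
from typing import Iterable, Optional


def canonical_lang(lang: Optional[str], available: Iterable[str]) -> Optional[str]:
    """Map any casing of a language code to the matching on-disk filename stem."""
    if lang is None:
        return None
    wanted = lang.lower()
    for stem in available:
        if stem.lower() == wanted:
            return stem
    # Assumption: unknown codes pass through unchanged so the caller can report them.
    return lang


stems = ["en", "pt-br", "zh-CN"]  # hypothetical stems of the locale JSON files
print(canonical_lang("PT-BR", stems))
print(canonical_lang("zh-cn", stems))
```

Using the canonical stem as the cache key as well means that differently cased inputs for the same language share one cache entry, which is the behavior the commit message asks for.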