Commit Graph

4 Commits

Author SHA1 Message Date
Igor Lins e Silva c35686c9e1 docs(install): recommend uv as the package manager
End-user installs now lead with `uv tool install mempalace`, with
`pip install mempalace` kept as a fallback. Dev/contributor docs lead
with `uv sync --extra dev` and `uv run` for tests/benchmarks/lint, with
the equivalent pip recipe kept inline. The shipped `/mempalace:init`
skill instructions (mempalace/instructions/init.md) try `uv tool install`
first when uv is on PATH, then fall back through the pip variants.

Adds a .python-version pin at 3.12 because the lockfile's
onnxruntime==1.24.3 only ships wheels for Python >=3.11; without the
pin, `uv sync` on a host where uv prefers 3.10 fails with no source
distribution available, which would make the documented command a
footgun. pyproject's `requires-python = ">=3.9"` is unchanged — pip
users on 3.9/3.10 are unaffected.

Files updated: README.md, CONTRIBUTING.md, CLAUDE.md, the gemini-cli
guide and example, the .claude-plugin / .codex-plugin READMEs, the
mempalace SKILL, the openclaw SKILL, tools/save.md, the three
benchmarks docs, and the corresponding website mirrors.
2026-05-08 01:38:00 -03:00
Igor Lins e Silva bf3b9c5979 docs: #875 follow-up — repo surfaces + reproduction URLs + CHANGELOG
Remaining in-repo surfaces carrying the same retracted or broken
claims as the public pages fixed in the previous two commits.

CONTRIBUTING.md
 - "Palace structure matters ... 34% retrieval improvement" → reframed
   as scoping (same rewording applied to the website equivalents).

benchmarks/BENCHMARKS.md
 - Add a prominent "Important caveat" block at the top of the
   "Comparison vs Published Systems" table explaining that R@5
   (retrieval recall) and QA accuracy are different metrics, with
   citations to Mastra, Mem0, and Supermemory's own published
   methodology pages. Annotate the specific competitor rows whose
   numbers are QA accuracy, not retrieval recall.
 - Annotate the `hybrid v4 + rerank 100%` row to note that the 99.4
   → 100 step was tuned on 3 specific wrong answers (already disclosed
   further down in the doc under "Benchmark Integrity"); the honest
   hybrid figure is held-out 98.4%.
 - Fix the broken clone URL — `aya-thekeeper/mempal` no longer points
   at anything; now `MemPalace/mempalace`.

benchmarks/README.md + benchmarks/HYBRID_MODE.md
 - Same clone-URL fix applied.

CHANGELOG.md
 - Add a ### Documentation entry under [Unreleased] v3.3.0 that names
   #875 and summarises the scope of the rewrite.
2026-04-14 21:38:00 -03:00
travisBREAKS 89206107fa fix(bench): remove hardcoded credential paths from benchmark runners (#177)
The `_load_api_key()` function in longmemeval_bench.py and locomo_bench.py
searched for API keys in a fixed path (`~/.config/lu/keys.json`) using
personal key names (`anthropic_milla`, `anthropic_claude_code_main`).

This leaks internal infrastructure details into the public codebase and
trains contributors to store credentials in a non-standard location
rather than using the standard ANTHROPIC_API_KEY env var.

Simplified to: CLI flag > env var > empty string. Updated help text
and HYBRID_MODE.md docs to match.

Co-authored-by: Tadao <tadao@travisfixes.com>
2026-04-11 23:14:36 -07:00
bensig 0f8fa8c7d5 bench: add benchmark runners, results docs, and test suite
Benchmarks: LongMemEval, LoCoMo, ConvoMem, MemBench runners with
methodology docs and hybrid retrieval analysis.

Tests: config, miner, convo_miner, normalize — 9 tests, all passing.
2026-04-04 18:33:42 -07:00