mempalace

jason/mempalace

Commit Graph

Author

SHA1

Message

Date

Igor Lins e Silva

ebc26f3960

fix: resolve formatting, regression logic, and pytest defaults

- Run ruff format on all benchmark files (fixes CI lint job)
- Fix check_regression() substring ambiguity: ordered keyword matching
  so "latency_improvement_pct" is correctly classified as higher-is-better
- Update stale comments in conftest.py referencing wrong fixture
- Add pytest addopts to skip benchmark/slow/stress markers by default

2026-04-08 10:56:39 -03:00
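
The substring ambiguity fixed in ebc26f3960 can be illustrated with a minimal sketch. The actual check_regression() code is not shown in this log, so the rule table, the helper names (classify_metric, is_regression), and the 10% tolerance below are all hypothetical. The only idea taken from the commit message is that keywords are matched in a fixed order, so that "latency_improvement_pct" is classified as higher-is-better even though it also contains "latency".

```python
# Hypothetical rule table (not from the repository). Order matters:
# keywords that should win a tie are listed first.
DIRECTION_RULES = [
    ("improvement", "higher"),  # must precede "latency"
    ("recall", "higher"),
    ("throughput", "higher"),
    ("latency", "lower"),
    ("memory", "lower"),
]


def classify_metric(name: str) -> str:
    """Return 'higher', 'lower', or 'unknown' for a metric name.

    The first matching keyword in DIRECTION_RULES decides the result.
    """
    lowered = name.lower()
    for keyword, direction in DIRECTION_RULES:
        if keyword in lowered:
            return direction
    return "unknown"


def is_regression(name: str, baseline: float, current: float,
                  tolerance: float = 0.1) -> bool:
    """Flag a regression when the metric moves the wrong way by more
    than `tolerance` (an assumed relative threshold)."""
    direction = classify_metric(name)
    if direction == "higher":
        return current < baseline * (1 - tolerance)
    if direction == "lower":
        return current > baseline * (1 + tolerance)
    return False  # unknown metrics are never flagged in this sketch
```

With an unordered substring check, "latency_improvement_pct" could match "latency" first and be treated as lower-is-better; placing "improvement" earlier in the list removes that ambiguity.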

Igor Lins e Silva

7e4db33061

fix: resolve ruff lint errors in benchmark suite

Remove unused imports (shutil, string, datetime, os, yaml, time,
SCALE_CONFIGS) and unused variable assignments in timing-only calls.

2026-04-08 05:10:39 -03:00

Igor Lins e Silva

e8017ca2ec

bench: add per-room recall threshold test

Concentrates all drawers into a single wing+room to isolate the
embedding model's retrieval limit independent of palace filtering.
Confirms recall degrades to ~0.4-0.5 at 5K drawers per room even
with wing+room filters applied — the spatial structure helps by
keeping buckets small, but can't fix the underlying embedding ceiling.

2026-04-08 05:06:31 -03:00
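
The recall figures quoted in e8017ca2ec depend on how recall is computed, which this log does not state. As a rough sketch, assuming a conventional recall@k over drawer IDs (the function names and the ID-list representation are hypothetical, not taken from the benchmark suite):

```python
from typing import Iterable, Sequence


def recall_at_k(retrieved_ids: Sequence[str],
                relevant_ids: Iterable[str],
                k: int) -> float:
    """Fraction of relevant IDs that appear in the top-k retrieved IDs."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(results: Sequence[tuple[Sequence[str], Iterable[str]]],
                     k: int) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one per query."""
    if not results:
        return 0.0
    scores = [recall_at_k(retrieved, relevant, k)
              for retrieved, relevant in results]
    return sum(scores) / len(scores)
```

Under this definition, a test like the one described would place every drawer in one wing and room, run a fixed set of queries, and compare the mean score across drawer counts.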

3 Commits