mempalace/website/guide/searching.md
Igor Lins e Silva f20a1a30fe docs(website): align mempalaceofficial.com with honest benchmarks
Part of #875. Bring the VitePress site into line with the new README
and the reproducibility scorecard: drop category-error comparisons,
drop retracted claims, retain only metrics and caveats that survive
audit.

website/index.md
 - New tagline matches README (local-first, verbatim, pluggable backend,
   96.6% R@5 raw, zero API calls).
 - Replace the "MemPalace hybrid 100% / Supermemory ~99% / Mastra
   94.87% / Mem0 ~85%" comparison table with a single honest table
   showing MemPalace's own retrieval-recall numbers (raw 96.6%,
   hybrid v4 held-out 98.4%). Add an explicit sentence explaining why
   we no longer publish a cross-system table on the landing page
   (retrieval recall vs QA accuracy are different metrics).
 - Soften the "ChromaDB-powered vector search" feature blurb to be
   backend-agnostic, since the retrieval layer is pluggable.

website/reference/benchmarks.md
 - Full rewrite of the retrieval-recall tables. No more "100%"
   headline; honest held-out 98.4% R@5 replaces it. Added the
   model-agnostic rerank result (99.2% R@5 / 100% R@10 with
   minimax-m2.7 via Ollama) to show the pipeline is not Haiku-specific.
 - Drop the LoCoMo "Hybrid v5 + Sonnet rerank (top-50) 100%" row.
   With per-conversation session counts of 19-32 and top_k=50, the
   retrieval stage returns every session by construction — the number
   measures an LLM's reading comprehension, not retrieval.
 - Drop the cross-system comparison tables. Link out to each project's
   own research page (Mastra, Mem0, Supermemory) for their published
   numbers and metric definitions.
 - Rewrite reproduction commands to use the correct repository and
   demonstrate the new --llm-backend ollama flag.

website/concepts/the-palace.md
 - Remove the "+34%" row / paragraph. Wing/room filtering is standard
   metadata filtering in the vector store, not a novel retrieval
   mechanism — the April-7 note already retracted that framing; this
   finishes the retraction on the website where it had remained.

website/guide/searching.md
 - Same treatment for "34% retrieval improvement". Reframe as
   operational scoping, not a novel boost.

website/reference/contributing.md
 - Update the "palace structure matters" bullet to reflect the same
   framing: scoping-not-magic.

website/concepts/knowledge-graph.md
 - Replace the MemPalace-vs-Zep feature matrix with a short "related
   work" note that links to Zep's own documentation for authoritative
   details on their deployment model. Avoids claims we cannot verify
   at source.
2026-04-14 21:37:45 -03:00


# Searching Memories
MemPalace uses semantic vector search (ChromaDB is the default backend) to find relevant memories. When you search, you get **verbatim text**: the exact words, never summaries.
## CLI Search
```bash
# Search everything
mempalace search "why did we switch to GraphQL"
# Filter by wing (project)
mempalace search "database decision" --wing myapp
# Filter by room (topic)
mempalace search "auth decisions" --room auth-migration
# Filter by both
mempalace search "pricing" --wing driftwood --room costs
# More results
mempalace search "deploy process" --results 10
```
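
For scripts that shell out to the CLI, the flags shown above can be assembled programmatically. The following is a minimal sketch: `build_search_command` is a hypothetical helper, not part of MemPalace, and it relies only on the `--wing`, `--room`, and `--results` flags documented in this section.

```python
import shlex
from typing import List, Optional


def build_search_command(
    query: str,
    wing: Optional[str] = None,
    room: Optional[str] = None,
    results: Optional[int] = None,
) -> List[str]:
    """Build an argv list for `mempalace search` (hypothetical helper)."""
    argv = ["mempalace", "search", query]
    if wing is not None:
        argv += ["--wing", wing]
    if room is not None:
        argv += ["--room", room]
    if results is not None:
        argv += ["--results", str(results)]
    return argv


cmd = build_search_command("pricing", wing="driftwood", room="costs")
# Print a shell-safe rendering; pass `cmd` to subprocess.run() to execute it.
print(shlex.join(cmd))
```

Building an argv list rather than a single string avoids quoting problems when the query contains spaces or shell metacharacters.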
## How Search Works
1. Your query is embedded using the vector store's default model (`all-MiniLM-L6-v2` with the default ChromaDB backend).
2. The embedding is compared against all drawers using cosine similarity.
3. Optional wing/room filters narrow the search scope — standard metadata filtering in the underlying vector store.
4. Results are returned with similarity scores and source metadata.
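
The steps above can be illustrated with a toy ranking loop. This is a sketch of the general technique (cosine similarity plus an exact-match metadata filter), not MemPalace's or ChromaDB's actual implementation. The three-dimensional vectors, the `DRAWERS` list, and the function names are made up for illustration; real embeddings come from the embedding model.

```python
import math
from typing import Any, Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_drawers(
    query_vec: Sequence[float],
    drawers: List[Dict[str, Any]],
    wing: Optional[str] = None,
    room: Optional[str] = None,
    n_results: int = 5,
) -> List[Tuple[float, Dict[str, Any]]]:
    """Score drawers inside the requested scope and return the best matches."""
    scored = []
    for drawer in drawers:
        # Metadata filter: skip drawers outside the requested wing/room.
        if wing is not None and drawer["wing"] != wing:
            continue
        if room is not None and drawer["room"] != room:
            continue
        scored.append((cosine_similarity(query_vec, drawer["embedding"]), drawer))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n_results]


# Toy data: hand-written vectors standing in for real embeddings.
DRAWERS = [
    {"text": "toy note A", "wing": "myapp", "room": "api", "embedding": [0.9, 0.1, 0.0]},
    {"text": "toy note B", "wing": "myapp", "room": "auth", "embedding": [0.1, 0.9, 0.0]},
    {"text": "toy note C", "wing": "driftwood", "room": "costs", "embedding": [0.8, 0.0, 0.6]},
]

for score, drawer in rank_drawers([1.0, 0.0, 0.0], DRAWERS, wing="myapp"):
    print(f"[{score:.3f}] {drawer['wing']}/{drawer['room']}")
```

A production vector store uses an approximate nearest-neighbor index instead of a linear scan, but the scoring and filtering semantics are the same idea.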
### Why Scoping Matters
Wing/room filtering is useful when a single palace contains many unrelated projects or people. Narrowing the search to a specific wing (or wing + room) means the vector store only scores candidates inside that scope, which keeps retrieval predictable as the palace grows.
This is a metadata-filter feature of the vector store, not a novel retrieval mechanism. Treat it as an operational convenience: clear scoping rules that a human or an agent can apply predictably.
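
With the ChromaDB backend, this kind of scoping corresponds to a `where` clause on the query. The sketch below shows how such a clause could be built. `build_where` is a hypothetical helper, and the metadata keys `wing` and `room` are assumptions based on the result fields shown on this page; MemPalace's internal filter construction may differ.

```python
from typing import Any, Dict, List, Optional


def build_where(
    wing: Optional[str] = None,
    room: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Build a ChromaDB-style `where` filter (hypothetical helper)."""
    conditions: List[Dict[str, Any]] = []
    if wing:
        conditions.append({"wing": wing})
    if room:
        conditions.append({"room": room})
    if not conditions:
        return None  # no filter: search the whole collection
    if len(conditions) == 1:
        return conditions[0]
    # ChromaDB expects multiple conditions to be combined with an operator.
    return {"$and": conditions}


print(build_where(wing="driftwood", room="costs"))
# With a ChromaDB collection, the result would be passed as:
#   collection.query(query_texts=["pricing"], n_results=5, where=build_where(...))
```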
## Programmatic Search
Use the Python API for integration:
```python
from mempalace.searcher import search_memories

results = search_memories(
    query="auth decisions",
    palace_path="~/.mempalace/palace",
    wing="myapp",
    room="auth",
    n_results=5,
)

for hit in results["results"]:
    print(f"[{hit['similarity']}] {hit['wing']}/{hit['room']}")
    print(f" {hit['text'][:200]}")
```
The `search_memories()` function returns a dict:
```python
{
    "query": "auth decisions",
    "filters": {"wing": "myapp", "room": "auth"},
    "results": [
        {
            "text": "We decided to migrate auth to Clerk because...",
            "wing": "myapp",
            "room": "auth-migration",
            "source_file": "session_2026-01-15.md",
            "similarity": 0.892,
        },
        # ...
    ],
}
```
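
Because the return value is a plain dict, post-processing needs no MemPalace imports. The sketch below drops low-similarity hits and formats the rest. `filter_hits` and `format_hit` are hypothetical helpers, the `0.5` threshold is an arbitrary illustration rather than a recommended value, and the second hit in `sample` is made-up data that follows the shape shown above.

```python
from typing import Any, Dict, List


def filter_hits(response: Dict[str, Any], min_similarity: float = 0.5) -> List[Dict[str, Any]]:
    """Keep hits at or above the threshold, best match first."""
    hits = [
        hit
        for hit in response.get("results", [])
        if hit.get("similarity", 0.0) >= min_similarity
    ]
    return sorted(hits, key=lambda hit: hit["similarity"], reverse=True)


def format_hit(hit: Dict[str, Any], width: int = 200) -> str:
    """Render one hit as a single line, truncating long text."""
    return f"[{hit['similarity']:.3f}] {hit['wing']}/{hit['room']}: {hit['text'][:width]}"


# Sample response in the documented shape; the second hit is illustrative only.
sample = {
    "query": "auth decisions",
    "filters": {"wing": "myapp", "room": "auth"},
    "results": [
        {
            "text": "We decided to migrate auth to Clerk because...",
            "wing": "myapp",
            "room": "auth-migration",
            "source_file": "session_2026-01-15.md",
            "similarity": 0.892,
        },
        {
            "text": "toy low-relevance note",
            "wing": "myapp",
            "room": "auth-migration",
            "source_file": "toy_example.md",
            "similarity": 0.31,
        },
    ],
}

for hit in filter_hits(sample):
    print(format_hit(hit))
```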
## MCP Search
When connected via MCP, your AI searches automatically:
> *"What did we decide about auth last month?"*
The AI calls `mempalace_search` behind the scenes. You never type a search command.
See [MCP Integration](/guide/mcp-integration) for setup.
## Wake-Up Context
Instead of searching, you can load a compact context of your world:
```bash
# Load identity + top memories (~600-900 tokens in typical use)
mempalace wake-up
# Project-specific context
mempalace wake-up --wing driftwood
```
This loads Layer 0 (identity) and Layer 1 (essential story) as bounded startup context before the first retrieval call.
See [Memory Stack](/concepts/memory-stack) for details on the 4-layer architecture.