Igor Lins e Silva dfb22f5345 docs: add VitePress documentation site
- 22 content pages across Guide, Concepts, and Reference sections
- Custom indigo/cyan theme with Lucide icons and Mermaid diagrams
- GitHub Actions workflow for GitHub Pages deployment
- Live preview: https://mempalace-docs.netlify.app/
2026-04-09 19:41:08 -03:00

# Memory Stack
MemPalace uses a 4-layer memory stack. The first two layers are always loaded; the deeper layers pull in progressively more data only when needed.
## The Layers
| Layer | What | Size | When |
|-------|------|------|------|
| **L0** | Identity — who is this AI? | ~50-100 tokens | Always loaded |
| **L1** | Essential Story — top moments | ~500-800 tokens | Always loaded |
| **L2** | Room Recall — filtered retrieval | ~200-500 tokens each | When topic comes up |
| **L3** | Deep Search — full semantic query | Variable | When explicitly asked |
In the current implementation, a typical wake-up (L0 + L1) is roughly **600-900 tokens**. L2 and L3 searches only fire when needed.
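As a quick sanity check on that figure, the sketch below sums the ranges for the always-loaded layers. The numbers are copied from the table; the dictionary and function names are illustrative and not part of MemPalace.

```python
# Token ranges copied from the table above. L2 and L3 are left out
# because they load on demand rather than at wake-up.
ALWAYS_LOADED = {
    "L0": (50, 100),
    "L1": (500, 800),
}


def wake_up_range(layers):
    """Sum the (low, high) token estimates of the given layers."""
    low = sum(lo for lo, _ in layers.values())
    high = sum(hi for _, hi in layers.values())
    return low, high


low, high = wake_up_range(ALWAYS_LOADED)
print(f"wake-up budget: ~{low}-{high} tokens")
# prints: wake-up budget: ~550-900 tokens
```

The summed range brackets the typical figure quoted above; the actual size depends on your identity file and on what Layer 1 selects.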
## Layer 0: Identity
A plain text file at `~/.mempalace/identity.txt`. Always loaded as the AI's self-concept.
```text
I am Atlas, a personal AI assistant for Alice.
Traits: warm, direct, remembers everything.
People: Alice (creator), Bob (Alice's partner).
Project: A journaling app that helps people process emotions.
```
At ~50 tokens, this example tells the AI who it is and who it works with.
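If you want to read the identity file from your own tooling, a minimal sketch might look like the following. Only the file path comes from this page; `load_identity` and its `fallback` parameter are hypothetical helpers, not MemPalace API.

```python
from pathlib import Path

# Path taken from the text above; everything else here is illustrative.
IDENTITY_PATH = Path.home() / ".mempalace" / "identity.txt"


def load_identity(path=IDENTITY_PATH, fallback=""):
    """Return the identity text, or `fallback` if the file does not exist."""
    try:
        return path.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return fallback
```

Returning a fallback instead of raising keeps a missing identity file from blocking wake-up; whether MemPalace itself behaves this way is not stated here.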
## Layer 1: Essential Story
Auto-generated from the highest-importance drawers in the palace. The generator picks the top moments, groups them by room, and keeps the output bounded.
The generation process:
1. Reads all drawers from ChromaDB
2. Scores each by importance/emotional weight
3. Takes the top 15 moments
4. Groups by room for readability
5. Truncates to fit within 3,200 characters
```text
## L1 — ESSENTIAL STORY
[auth-migration]
- Team decided to migrate from Auth0 to Clerk — pricing + DX (session_2026-01-15.md)
- Kai debugged the OAuth token refresh issue (session_2026-01-20.md)
[deploy-process]
- Switched to blue-green deploys after the January outage (session_2026-02-01.md)
```
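The five generation steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the actual implementation: the `Drawer` record and its `importance` field are hypothetical, the drawers are passed in as a list rather than read from ChromaDB, and truncation is a simple character slice.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_MOMENTS = 15   # step 3: top moments to keep
MAX_CHARS = 3200   # step 5: character cap on the rendered text
HEADER = "## L1 \u2014 ESSENTIAL STORY"  # same header as the sample output


@dataclass
class Drawer:  # hypothetical record; the real drawer schema may differ
    room: str
    text: str
    source: str
    importance: float  # assumed to be scored already (step 2)


def build_essential_story(drawers, max_moments=MAX_MOMENTS, max_chars=MAX_CHARS):
    # Steps 2-3: rank by importance and keep only the top moments.
    top = sorted(drawers, key=lambda d: d.importance, reverse=True)[:max_moments]

    # Step 4: group by room, in the order rooms first appear in the ranking.
    rooms = defaultdict(list)
    for drawer in top:
        rooms[drawer.room].append(drawer)

    lines = [HEADER]
    for room, items in rooms.items():
        lines.append(f"[{room}]")
        lines.extend(f"- {d.text} ({d.source})" for d in items)

    # Step 5: hard cap on the output size.
    return "\n".join(lines)[:max_chars]
```

A character slice can cut a line in half; a real generator would more likely stop at the last complete entry that fits.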
## Layer 2: On-Demand Recall
Loaded when a specific topic or wing comes up in conversation. Retrieves drawers filtered by wing and/or room, typically ~200-500 tokens per recall.
```python
from mempalace.layers import MemoryStack

stack = MemoryStack()
# L2: fetch drawers filtered by wing and room
stack.recall(wing="driftwood", room="auth")
# → returns recent drawers about auth in the driftwood project
```
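Since drawers live in ChromaDB, a wing/room filter like this would plausibly translate into a metadata `where` clause. The sketch below builds such a clause; the metadata keys `wing` and `room` and the helper name `build_where` are assumptions, not confirmed MemPalace internals.

```python
def build_where(wing=None, room=None):
    """Build a ChromaDB-style metadata filter for a wing and/or room."""
    conditions = [
        {key: value}
        for key, value in (("wing", wing), ("room", room))
        if value is not None
    ]
    if not conditions:
        return None  # no filter requested
    if len(conditions) == 1:
        return conditions[0]
    # ChromaDB combines conditions on several fields with "$and".
    return {"$and": conditions}


print(build_where(wing="driftwood", room="auth"))
# prints: {'$and': [{'wing': 'driftwood'}, {'room': 'auth'}]}
```

The resulting dictionary could be passed as the `where=` argument of a ChromaDB collection's `get()` or `query()` call.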
## Layer 3: Deep Search
Full semantic search against the entire palace. This layer fires when you or the AI explicitly ask a question.
```python
stack.search("why did we switch to GraphQL")
# → returns top-5 matching drawers with similarity scores
```
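To make "top-5 with similarity scores" concrete, here is a toy ranking by cosine similarity over hand-written vectors. It only illustrates the idea: in MemPalace the embeddings and the nearest-neighbour search would be handled by ChromaDB, and the function names below are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, drawers, k=5):
    """Rank (text, vector) pairs against the query; return [(score, text)], best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in drawers]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Note that vector stores often report a distance rather than a similarity, so lower can mean better depending on the configured metric.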
## Wake-Up Budget
The point of the stack is bounded startup context, not a fixed universal token count. The exact size depends on your identity file and what Layer 1 selects, but the implementation keeps wake-up meaningfully smaller than loading the full corpus into the prompt.
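For a rough feel of how the Layer 1 character cap relates to tokens, the sketch below applies the common rule of thumb of about four characters per token for English text. That ratio is an assumption, not a tokenizer, and real counts vary by model and content.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text, not a tokenizer


def estimate_tokens(num_chars, chars_per_token=CHARS_PER_TOKEN):
    """Estimate a token count from a character count."""
    return math.ceil(num_chars / chars_per_token)


L1_CHAR_CAP = 3200  # Layer 1 truncation limit from the Layer 1 section
print(estimate_tokens(L1_CHAR_CAP))
# prints: 800
```

Under that heuristic, the 3,200-character cap lands at the top of the ~500-800 token range listed for L1.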
## Using the Stack
### CLI
```bash
# Wake-up context (L0 + L1)
mempalace wake-up
# Project-specific wake-up
mempalace wake-up --wing driftwood
```
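If you need the wake-up context from a script that cannot import the package, one option is to shell out to the CLI. The sketch below uses only the command and `--wing` flag shown above. It assumes the command prints the context to standard output, which this page does not confirm, and both function names are hypothetical.

```python
import subprocess


def wake_up_command(wing=None):
    """Build the argument list for `mempalace wake-up`."""
    cmd = ["mempalace", "wake-up"]
    if wing:
        cmd += ["--wing", wing]
    return cmd


def run_wake_up(wing=None):
    """Run the CLI and return its output (assumes the context goes to stdout)."""
    result = subprocess.run(
        wake_up_command(wing), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues when a wing name contains spaces.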
### Python API
```python
from mempalace.layers import MemoryStack
stack = MemoryStack()
# L0 + L1: wake-up (~600-900 tokens in typical use)
print(stack.wake_up())
# L2: on-demand recall
print(stack.recall(wing="myapp"))
# L3: deep search
print(stack.search("pricing change"))
# Status
print(stack.status())
```