# JARVIS Windows — Project Roadmap

## Phase 1 — Core Stack ✅ (Week 1)

- [ ] FastAPI server with WebSocket endpoint
- [ ] Whisper STT (base model, local)
- [ ] Fish Audio TTS integration
- [ ] Claude API connection
- [ ] WebSocket manager (connect/disconnect/broadcast)
- [ ] Basic end-to-end voice loop (speak → transcribe → Claude → respond)

## Phase 2 — Google Workspace Integrations (Weeks 2–3)

### 2a — Auth + Core Services

- [ ] Google OAuth 2.0 setup (credentials.json → token.json)
- [ ] Google Calendar: read today's events, upcoming schedule
- [ ] Gmail: unread messages, search by sender/subject
- [ ] Google Tasks: list pending, create, complete

### 2b — Notes + Screen

- [ ] Google Keep via gkeepapi (or Markdown fallback)
- [ ] Screen vision: mss screenshot + win32gui active window
- [ ] Window list enumeration

### 2c — System Capabilities

- [ ] Terminal control via subprocess (PowerShell + CMD)
- [ ] File system: create projects, manage folders
- [ ] Git status/operations via subprocess
- [ ] Chrome automation via Selenium

## Phase 3 — Claude Tool Use (Week 4)

- [ ] Define tools manifest (all capabilities as function specs)
- [ ] Tool dispatch router (route Claude tool_use to capability modules)
- [ ] Multi-turn conversation with tool results
- [ ] Error handling + tool fallbacks

## Phase 4 — Dashboard + Packaging (Week 5)

- [ ] Wire WebSocket to Three.js dashboard frontend
- [ ] Live transcript display
- [ ] State: idle / listening / processing / speaking
- [ ] Windows startup registration (Task Scheduler)
- [ ] .env validation on startup

## Phase 5 — Optional Enhancements

- [ ] Docker containerization (for Unraid or headless server)
- [ ] Wake word detection ("Hey JARVIS")
- [ ] Conversation history / memory across sessions
- [ ] Google Drive file access
- [ ] React/Electron desktop wrapper
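
## Appendix: Tool Dispatch Sketch (Phase 3)

The Phase 3 items "Tool dispatch router" and "Error handling + tool fallbacks" could be approached roughly as below. This is a minimal sketch, not the project's actual design: the registry `TOOL_HANDLERS`, the `tool` decorator, the `dispatch` function, and the `list_tasks` handler are all hypothetical names. It assumes each Claude `tool_use` content block arrives as a dict with `id`, `name`, and `input` keys, and that the reply is a `tool_result` block carrying the matching `tool_use_id`.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry: tool name -> capability function.
TOOL_HANDLERS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Register a function as the handler for the named tool."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_HANDLERS[name] = func
        return func
    return decorator


@tool("list_tasks")
def list_tasks(status: str = "pending") -> list:
    # Placeholder body: a real handler would call the Google Tasks module.
    return [{"title": "example task", "status": status}]


def dispatch(block: dict) -> dict:
    """Route one tool_use block to its handler and build a tool_result block."""
    result = {"type": "tool_result", "tool_use_id": block["id"]}
    handler = TOOL_HANDLERS.get(block["name"])
    if handler is None:
        # Unknown tool: tell the model instead of raising.
        result.update(content=f"Unknown tool: {block['name']}", is_error=True)
        return result
    try:
        result["content"] = json.dumps(handler(**block["input"]))
    except Exception as exc:
        # Handler failed (bad arguments, API error, ...): report it as an error result.
        result.update(content=f"{type(exc).__name__}: {exc}", is_error=True)
    return result
```

Returning an error result rather than raising lets the multi-turn loop continue, so the model can retry with different arguments or fall back to another tool.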