jarvis/ROADMAP.md
2026-03-24 00:11:34 -05:00


JARVIS Windows — Project Roadmap

Phase 1 — Core Stack (Week 1)

  • FastAPI server with WebSocket endpoint
  • Whisper STT (base model, local)
  • Fish Audio TTS integration
  • Claude API connection
  • WebSocket manager (connect/disconnect/broadcast)
  • Basic end-to-end voice loop (speak → transcribe → Claude → respond)
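
Sketch of the WebSocket manager item (connect/disconnect/broadcast). This is a minimal illustration, not the project's actual code: `ConnectionManager` is a hypothetical name, and it assumes each socket exposes async `accept()` and `send_text()` methods, as FastAPI's `WebSocket` does.

```python
import asyncio


class ConnectionManager:
    """Tracks open WebSocket connections (hypothetical helper for illustration)."""

    def __init__(self):
        self.active = set()

    async def connect(self, ws):
        # Complete the WebSocket handshake, then start tracking the socket.
        await ws.accept()
        self.active.add(ws)

    def disconnect(self, ws):
        # discard() is a no-op if the socket was already removed.
        self.active.discard(ws)

    async def broadcast(self, message: str):
        # Iterate over a copy so dead sockets can be dropped mid-loop.
        for ws in list(self.active):
            try:
                await ws.send_text(message)
            except Exception:
                # Assumption: any send failure means the client is gone.
                self.disconnect(ws)
```

In a FastAPI endpoint this would likely be called as `await manager.connect(websocket)` on entry and `manager.disconnect(websocket)` when the receive loop ends.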

Phase 2 — Google Workspace Integrations (Weeks 2–3)

2a — Auth + Core Services

  • Google OAuth 2.0 setup (credentials.json → token.json)
  • Google Calendar: read today's events, upcoming schedule
  • Gmail: unread messages, search by sender/subject
  • Google Tasks: list pending, create, complete

2b — Notes + Screen

  • Google Keep via gkeepapi (or Markdown fallback)
  • Screen vision: mss screenshot + win32gui active window
  • Window list enumeration

2c — System Capabilities

  • Terminal control via subprocess (PowerShell + CMD)
  • File system: create projects, manage folders
  • Git status/operations via subprocess
  • Chrome automation via Selenium
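
Sketch of the terminal control item, assuming commands run through `subprocess.run` with a timeout and captured output. `run_command` and `powershell_argv` are hypothetical names; `-NoProfile`, `-NonInteractive`, and `-Command` are standard `powershell` flags.

```python
import subprocess


def powershell_argv(command: str) -> list:
    # Build the argument list for one non-interactive PowerShell command.
    return ["powershell", "-NoProfile", "-NonInteractive", "-Command", command]


def run_command(argv: list, timeout: float = 30.0) -> dict:
    """Run a command without a shell and return its output as a dict."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
    except FileNotFoundError as exc:
        # The executable is not installed or not on PATH.
        return {"ok": False, "stdout": "", "stderr": str(exc)}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout.strip(),
        "stderr": proc.stderr.strip(),
    }
```

Git status could go through the same helper, for example `run_command(["git", "status", "--short"])`. Passing an argument list rather than a shell string avoids quoting problems when the command text comes from a model.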

Phase 3 — Claude Tool Use (Week 4)

  • Define tools manifest (all capabilities as function specs)
  • Tool dispatch router (route Claude tool_use to capability modules)
  • Multi-turn conversation with tool results
  • Error handling + tool fallbacks
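
Sketch of the tool dispatch router, assuming a simple name-to-function registry. The `tool_use` and `tool_result` field names follow the Claude Messages API; `TOOLS`, `tool`, `dispatch`, and the `list_tasks` placeholder are hypothetical.

```python
from typing import Any, Callable, Dict

# Registry mapping tool names to capability functions (hypothetical).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Decorator that registers a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("list_tasks")
def list_tasks(status: str = "pending"):
    # Placeholder body; a real module would call Google Tasks here.
    return [f"example {status} task"]


def dispatch(block: dict) -> dict:
    """Turn one tool_use block into a tool_result block."""
    result = {"type": "tool_result", "tool_use_id": block["id"]}
    handler = TOOLS.get(block["name"])
    if handler is None:
        return {**result, "content": f"Unknown tool: {block['name']}", "is_error": True}
    try:
        return {**result, "content": str(handler(**block["input"]))}
    except Exception as exc:
        # Report the failure back to the model instead of crashing the loop.
        return {**result, "content": f"{type(exc).__name__}: {exc}", "is_error": True}
```

Returning errors as `tool_result` blocks with `is_error` set lets the multi-turn conversation continue, so the model can retry or pick a fallback tool.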

Phase 4 — Dashboard + Packaging (Week 5)

  • Wire WebSocket to Three.js dashboard frontend
  • Live transcript display
  • State: idle / listening / processing / speaking
  • Windows startup registration (Task Scheduler)
  • .env validation on startup
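
Sketch of the state item, assuming the four states form a small state machine whose changes are pushed to the dashboard as JSON. The allowed transitions and the message shape are assumptions, not part of the roadmap.

```python
import json
from enum import Enum


class State(str, Enum):
    IDLE = "idle"
    LISTENING = "listening"
    PROCESSING = "processing"
    SPEAKING = "speaking"


# Assumed transition table; adjust to the real voice loop.
ALLOWED = {
    State.IDLE: {State.LISTENING},
    State.LISTENING: {State.PROCESSING, State.IDLE},
    State.PROCESSING: {State.SPEAKING, State.IDLE},
    State.SPEAKING: {State.IDLE, State.LISTENING},
}


class Assistant:
    def __init__(self):
        self.state = State.IDLE

    def transition(self, new: State) -> str:
        """Move to a new state and return the JSON message to broadcast."""
        if new not in ALLOWED[self.state]:
            raise ValueError(
                f"Illegal transition: {self.state.value} -> {new.value}"
            )
        self.state = new
        return json.dumps({"type": "state", "state": new.value})
```

The returned string can be sent over the WebSocket as-is, and the frontend can switch its animation on the `state` field.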

Phase 5 — Optional Enhancements

  • Docker containerization (for Unraid or headless server)
  • Wake word detection ("Hey JARVIS")
  • Conversation history / memory across sessions
  • Google Drive file access
  • React/Electron desktop wrapper