nyaa-crawler/AGENTS.md
jason ded0875e72 feat: initial full-stack nyaa-crawler implementation
- Node.js + TypeScript + Express backend using built-in node:sqlite
- React + Vite frontend with dark-themed UI
- Nyaa.si RSS polling via fast-xml-parser
- Watch list with show/episode CRUD and status tracking
- Auto-download scheduler with node-cron (configurable interval)
- .torrent file downloader with batch-release filtering
- Settings page for poll interval and quality defaults
- Dockerfile and docker-compose for Unraid deployment
- SQLite DB with migrations (shows, episodes, settings tables)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 14:00:09 -05:00

AGENTS.md

Mission

Build a small, dockerized web service that lets a user:

  • Search and select anime releases from Nyaa.si.
  • Persist a personal “watch list” of shows and their release patterns.
  • Poll Nyaa (via RSS or lightweight scraping / API wrapper) for new episodes.
  • Automatically download the next .torrent file for each tracked show into a host-mounted download directory.
  • Track which episodes are:
    • Automatically downloaded (auto-checked),
    • Manually checked as already downloaded by the user.

Target deployment is an Unraid server using a single Docker container with a simple web UI and a lightweight persistence layer (SQLite preferred).[^1]


High-level Architecture

  • Frontend: Minimal web UI (SPA or server-rendered) for:
    • Searching Nyaa.si.
    • Adding/removing shows from the watch list.
    • Viewing episodes per show with status (pending, downloaded).
    • Manually checking episodes as downloaded.
  • Backend:
    • HTTP API for the UI.
    • Nyaa integration (RSS and/or search scraping).
    • Scheduler/worker to periodically poll Nyaa and enqueue downloads.
    • Torrent fetcher that downloads .torrent files to a host-mounted directory.
  • Data store:
    • SQLite database stored on a bind-mounted volume for easy backup and migration.
  • Containerization:
    • Single Docker image with app + scheduler.
    • Config via environment variables.
    • Unraid-friendly: configurable ports, volume mapping for DB and torrents.[^2][^1]

Functional Requirements

1. Nyaa Integration

  • Use Nyaa's RSS endpoints for polling where possible (e.g. https://nyaa.si/?page=rss plus query parameters), falling back to HTML scraping or an existing wrapper library if necessary.[^3][^4][^5][^6][^7]
  • Support user-driven search:
    • Input: search term (e.g. “Jujutsu Kaisen 1080p SubsPlease”).
    • Output: recent matching torrents with:
      • Title
      • Torrent ID
      • Category
      • Size
      • Magnet/torrent link URL if exposed in the feed or page.[^8][^9][^10]
  • When a user “adds” an anime:
    • Store a normalized pattern to match future episodes (e.g. base title + quality/resolution + sub group).
    • Maintain reference to the Nyaa search or RSS query that defines this feed.[^6][^3]
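
A minimal sketch of the two pieces described above: building the RSS query URL for a stored search, and deriving a normalized pattern from a release title. The helper names (`buildRssUrl`, `parseReleaseTitle`), the `q`/`c`/`f` query parameters, the `1_2` category default, and the `[Group] Title - NN (quality)` title shape are assumptions for illustration and should be checked against real feed data.

```typescript
// Hypothetical helper: builds a Nyaa RSS URL for a stored search query.
// The parameter names q (query), c (category) and f (filter) are assumptions.
function buildRssUrl(query: string, category = "1_2", filter = "0"): string {
  const params = [
    "page=rss",
    `q=${encodeURIComponent(query)}`,
    `c=${encodeURIComponent(category)}`,
    `f=${encodeURIComponent(filter)}`,
  ];
  return `https://nyaa.si/?${params.join("&")}`;
}

interface ReleasePattern {
  subGroup: string | null;
  baseTitle: string;
  episode: string | null;
  quality: string | null;
}

// Parses titles shaped like "[Group] Show Name - 03 (1080p) [HASH].mkv".
// Titles that do not follow this shape yield null fields instead of guesses.
function parseReleaseTitle(title: string): ReleasePattern {
  const group = title.match(/^\[([^\]]+)\]/);
  const quality = title.match(/\b(\d{3,4}p)\b/i);
  // Drop the leading [group] tag, then split at " - <episode number>".
  const rest = title.replace(/^\[[^\]]+\]\s*/, "");
  const ep = rest.match(/^(.*?)\s+-\s+(\d{1,4}(?:\.\d)?)(?:v\d+)?(?=\s|$|\[|\()/);
  return {
    subGroup: group ? group[1] : null,
    baseTitle: (ep ? ep[1] : rest).trim(),
    episode: ep ? ep[2] : null,
    quality: quality ? quality[1].toLowerCase() : null,
  };
}
```

Storing `baseTitle`, `quality`, and `subGroup` next to the generated RSS URL keeps the show record self-describing, so the poller does not need to re-derive the query later.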

2. Watch List & Episodes

  • Entities:
    • Show: id, display name, search/RSS query, quality filter, fansub group, active flag.
    • Episode: id, show_id, episode_code (string, e.g. “S01E03” or “03”), torrent_id (Nyaa ID), title, status (pending, downloaded_auto, downloaded_manual), torrent_url, created_at, updated_at, downloaded_at.
  • Behavior:
    • Adding a show:
      • Run an immediate search.
      • Populate existing episodes in DB as pending (no download) to let the user backfill by manually checking already downloaded ones.
    • Removing a show:
      • Leave episodes in DB but mark show as inactive (no further polling).
    • Manual check:
      • User can mark an episode as already downloaded (downloaded_manual), no torrent action taken.
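
The add and remove behavior above could be sketched as pure functions over in-memory rows, leaving persistence to the caller. All type and function names here (`FeedItem`, `EpisodeRow`, `ingestAsPending`, `deactivateShow`) are hypothetical.

```typescript
type EpisodeStatus = "pending" | "downloaded_auto" | "downloaded_manual";

interface FeedItem {
  torrentId: string;
  title: string;
  torrentUrl: string;
  episodeCode: string;
}

interface EpisodeRow extends FeedItem {
  showId: number;
  status: EpisodeStatus;
  downloadedAt: string | null;
}

// Adding a show: turn existing feed items into pending rows (no download),
// skipping torrents that are already stored or repeated within the feed.
function ingestAsPending(
  showId: number,
  items: FeedItem[],
  knownTorrentIds: Set<string>,
): EpisodeRow[] {
  const seen = new Set(knownTorrentIds);
  const rows: EpisodeRow[] = [];
  for (const item of items) {
    if (seen.has(item.torrentId)) continue;
    seen.add(item.torrentId);
    rows.push({ ...item, showId, status: "pending", downloadedAt: null });
  }
  return rows;
}

interface Show {
  id: number;
  name: string;
  isActive: boolean;
}

// Removing a show is a soft delete: episodes stay, polling stops.
function deactivateShow(show: Show): Show {
  return { ...show, isActive: false };
}
```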

3. Auto-Download Logic

  • Periodic job (e.g. every 5-15 minutes, configurable):
    • For each active show:
      • Query Nyaa using its stored RSS/search parameters.[^4][^3][^6]
      • Determine the “next” episode:
        • Prefer simplest rule: highest episode number not yet marked downloaded.
        • Guard against batch torrents by using size or title pattern heuristics (e.g. skip titles containing “Batch”).
      • If the next episode's torrent is not yet in DB:
        • Create an Episode record with status downloaded_auto.
        • Download the .torrent file (NOT the media itself) into the mapped host directory.
          • Filename suggestion: <show-slug>-ep<episode>-<torrent-id>.torrent.
  • Do not attempt to control or integrate directly with a torrent client (scope is “download the .torrent file” only).
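
One way to express the selection rule, the batch guard, and the filename suggestion above. This is a sketch under stated assumptions: the `Candidate` shape, the 4 GiB size threshold, and the `01-12` range pattern are illustrative choices, not requirements from this document.

```typescript
interface Candidate {
  torrentId: string;
  title: string;
  episode: number | null; // parsed episode number, null when unknown
  sizeBytes?: number;
}

// Assumed threshold: single episodes are rarely larger than 4 GiB.
const BATCH_SIZE_LIMIT = 4 * 1024 * 1024 * 1024;

// Heuristic batch detection by keyword, episode range ("01-12"), or size.
function looksLikeBatch(c: Candidate): boolean {
  if (/\bbatch\b/i.test(c.title)) return true;
  if (/\b\d{2,3}[~-]\d{2,3}\b/.test(c.title)) return true;
  return c.sizeBytes !== undefined && c.sizeBytes > BATCH_SIZE_LIMIT;
}

// Picks the highest-numbered episode not yet marked downloaded.
function pickNextEpisode(
  candidates: Candidate[],
  downloaded: Set<number>,
): Candidate | null {
  let best: Candidate | null = null;
  let bestEpisode = -Infinity;
  for (const c of candidates) {
    if (c.episode === null || looksLikeBatch(c)) continue;
    if (downloaded.has(c.episode)) continue;
    if (c.episode > bestEpisode) {
      best = c;
      bestEpisode = c.episode;
    }
  }
  return best;
}

// Builds "<show-slug>-ep<episode>-<torrent-id>.torrent".
function torrentFileName(showName: string, episode: number, torrentId: string): string {
  const slug = showName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  const ep = episode < 10 ? `0${episode}` : String(episode);
  return `${slug}-ep${ep}-${torrentId}.torrent`;
}
```

The range pattern deliberately requires no spaces around the separator, so a title such as "Mob Psycho 100 - 03" is not mistaken for a batch.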

4. Web UI

  • Views:
    • Shows list:
      • Add show (form: name, search query, quality, group).
      • Toggle active/inactive.
      • Quick link to show detail.
    • Show detail:
      • Table of episodes: episode number/title, Nyaa ID, status, timestamps.
      • Controls:
        • Manually mark individual episodes as downloaded.
        • Bulk “mark previous episodes as downloaded” helper (e.g. “mark up to episode N”).
    • Settings:
      • Poll interval.
      • Default quality / sub group preferences.
      • Torrent download directory (read-only display; actual path comes from environment/volume).
  • UX constraints:
    • Keep it extremely simple; this is an internal tool.
    • Assume a single-user instance on a LAN.
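
The bulk "mark up to episode N" helper from the show detail view might reduce to a small pure function like the one below. The `EpisodeView` shape and the numeric `episode` field are assumptions; only pending rows are touched, so existing `downloaded_auto` entries keep their status and timestamp.

```typescript
type EpisodeStatus = "pending" | "downloaded_auto" | "downloaded_manual";

interface EpisodeView {
  episode: number;
  status: EpisodeStatus;
  downloadedAt: string | null;
}

// Marks every pending episode with number <= upTo as manually downloaded.
// No torrent action is taken; only status and timestamp change.
function markUpTo(rows: EpisodeView[], upTo: number, now: Date = new Date()): EpisodeView[] {
  return rows.map((row) =>
    row.status === "pending" && row.episode <= upTo
      ? { ...row, status: "downloaded_manual" as const, downloadedAt: now.toISOString() }
      : row,
  );
}
```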

Non-Functional Requirements

  • Language/Stack:
    • Prefer Node.js + TypeScript backend with a minimal React or server-rendered frontend to align with existing projects, unless you choose a simpler stack.
  • Security:
    • The app is assumed to run on a LAN; basic auth or reverse-proxy auth can be added later.
    • Do not expose any admin-only functionality without at least a simple auth hook.
  • Resilience:
    • Polling should be robust to Nyaa timeouts and 4xx/5xx responses (retry with backoff, log errors).
    • Do not spam Nyaa with aggressive polling; default interval should be conservative (e.g. 15 minutes, configurable).
  • Observability:
    • Minimal logging for:
      • Polling attempts.
      • New episodes found.
      • Torrent downloads started/completed or failed.
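
A possible shape for the retry-with-backoff requirement above. `backoffDelays` and `withRetry` are hypothetical helpers, and the defaults (3 retries, 1 s base delay, 60 s cap) are assumptions. The sleep and log functions are injectable so the worker can be tested without real waiting.

```typescript
// Exponential schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(retries: number, baseMs = 1000, maxMs = 60000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(baseMs * Math.pow(2, i), maxMs));
  }
  return delays;
}

// Runs task, retrying on any thrown error until the schedule is exhausted.
// Each failure is logged; the last error is rethrown to the caller.
async function withRetry<T>(
  task: () => Promise<T>,
  retries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  log: (message: string) => void = console.error,
): Promise<T> {
  const delays = backoffDelays(retries);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= delays.length) throw err;
      log(`attempt ${attempt + 1} failed, retrying in ${delays[attempt]} ms: ${String(err)}`);
      await sleep(delays[attempt]);
    }
  }
}
```

A caller would typically have `task` throw on timeouts and on non-2xx responses so that both paths share the same schedule. Most 4xx responses will not succeed on retry, so treating them as fatal may be the better default.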

Data Model (Initial)

Tables

  • shows
    • id (PK)
    • name (string)
    • search_query (string)
    • quality (string, nullable)
    • sub_group (string, nullable)
    • rss_url (string, nullable)
    • is_active (boolean, default true)
    • created_at, updated_at
  • episodes
    • id (PK)
    • show_id (FK → shows.id)
    • episode_code (string, e.g. “S01E03” or “03”)
    • title (string)
    • torrent_id (string, Nyaa ID)
    • torrent_url (string)
    • status (enum: pending, downloaded_auto, downloaded_manual)
    • downloaded_at (datetime, nullable)
    • created_at, updated_at
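
The tables above could be expressed as an initial migration along these lines. The column types, the `CHECK` constraint, and the `UNIQUE (show_id, torrent_id)` constraint are assumptions layered on top of the listed fields. `SqlExecutor` is a hypothetical structural type; any object with an `exec(sql)` method fits, which is meant to include the `DatabaseSync` class from `node:sqlite`.

```typescript
// Minimal structural type so the migration runner is not tied to one driver.
interface SqlExecutor {
  exec(sql: string): void;
}

const MIGRATIONS: string[] = [
  `CREATE TABLE IF NOT EXISTS shows (
     id INTEGER PRIMARY KEY,
     name TEXT NOT NULL,
     search_query TEXT NOT NULL,
     quality TEXT,
     sub_group TEXT,
     rss_url TEXT,
     is_active INTEGER NOT NULL DEFAULT 1,
     created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
     updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
   )`,
  `CREATE TABLE IF NOT EXISTS episodes (
     id INTEGER PRIMARY KEY,
     show_id INTEGER NOT NULL REFERENCES shows(id),
     episode_code TEXT NOT NULL,
     title TEXT NOT NULL,
     torrent_id TEXT NOT NULL,
     torrent_url TEXT NOT NULL,
     status TEXT NOT NULL DEFAULT 'pending'
       CHECK (status IN ('pending', 'downloaded_auto', 'downloaded_manual')),
     downloaded_at TEXT,
     created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
     updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
     UNIQUE (show_id, torrent_id)
   )`,
];

// Applies every statement in order; IF NOT EXISTS makes reruns harmless.
function applyMigrations(db: SqlExecutor): void {
  for (const sql of MIGRATIONS) db.exec(sql);
}
```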

Container & Unraid Integration

Environment

  • PORT: HTTP port to listen on (default 3000).
  • POLL_INTERVAL_SECONDS: Polling frequency.
  • TORRENT_OUTPUT_DIR: Inside-container path where .torrent files are written.
  • DATABASE_PATH: Inside-container path to SQLite file.
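
A small loader for these variables might look like the sketch below. Only the `PORT` default of 3000 comes from this document; the other defaults (900 seconds, `/data/torrents`, `/data/db.sqlite`) are assumptions borrowed from the volume and compose examples, and `loadConfig` itself is a hypothetical name.

```typescript
interface AppConfig {
  port: number;
  pollIntervalSeconds: number;
  torrentOutputDir: string;
  databasePath: string;
}

type Env = Record<string, string | undefined>;

// Parses a positive integer, falling back when the variable is unset or empty.
function parsePositiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Takes the environment as an argument (e.g. process.env) to stay testable.
function loadConfig(env: Env): AppConfig {
  return {
    port: parsePositiveInt(env["PORT"], 3000, "PORT"),
    pollIntervalSeconds: parsePositiveInt(
      env["POLL_INTERVAL_SECONDS"],
      900, // assumed default: 15 minutes
      "POLL_INTERVAL_SECONDS",
    ),
    torrentOutputDir: env["TORRENT_OUTPUT_DIR"] || "/data/torrents",
    databasePath: env["DATABASE_PATH"] || "/data/db.sqlite",
  };
}
```

Failing fast on a malformed value at startup is usually easier to diagnose in a container log than a scheduler that silently runs with a `NaN` interval.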

Volumes

  • Map SQLite DB to persistent storage:
    • /data/db.sqlite → Unraid share: e.g. /mnt/user/appdata/nyaa-watcher/db.sqlite.[^1][^2]
  • Map torrent output directory to a download share:
    • /data/torrents → e.g. /mnt/user/downloads/torrents/nyaa/.

Ports

  • Expose app port to LAN (bridge mode):
    • Container: 3000, Host: YOUR_PORT (e.g. 8082).

Example docker-compose snippet

services:
  nyaa-watcher:
    image: your-registry/nyaa-watcher:latest
    container_name: nyaa-watcher
    restart: unless-stopped
    environment:
      - PORT=3000
      - POLL_INTERVAL_SECONDS=900
      - TORRENT_OUTPUT_DIR=/data/torrents
      - DATABASE_PATH=/data/db.sqlite
    volumes:
      - /mnt/user/appdata/nyaa-watcher:/data
      - /mnt/user/downloads/torrents/nyaa:/data/torrents
    ports:
      - "8082:3000"

This can be translated to an Unraid template or used via docker-compose with a Docker context pointing at the Unraid host.[^11][^2][^1]


Implementation Roadmap

  1. Skeleton app
    • Set up HTTP server, health endpoint, and a static web page.
    • Wire SQLite with migrations for shows and episodes.
  2. Nyaa client
    • Implement RSS-based polling for a hard-coded query.
    • Parse feed, extract torrent IDs, titles, and links.[^5][^3][^4][^6]
    • Optionally evaluate an existing node nyaa-si wrapper as a shortcut.[^7]
  3. Watch list CRUD
    • API endpoints + UI for managing shows.
    • Initial search → show add flow.
  4. Episode tracking
    • When adding a show, ingest existing feed items into episodes as pending.
    • Implement manual check/mark endpoints and UI.
  5. Auto-download worker
    • Background job to poll active shows and write .torrent files.
    • Update episode status to downloaded_auto.
  6. Dockerization & Unraid deployment
    • Dockerfile, volume mappings, environment configuration.
    • Test deployment on Unraid, ensure persistence and torrent file visibility.
  7. Polish
    • Basic auth or IP allowlist if desired.
    • Guardrails against batch torrent downloads.
    • Minimal styling for the UI.
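
For step 2 of the roadmap, mapping already-parsed RSS items to typed records could look roughly like this. It assumes each item arrives as a plain object (for example from an XML parser such as fast-xml-parser), that `guid` holds a `/view/<id>` URL, that `link` holds the `.torrent` download URL, and that `nyaa:category` and `nyaa:size` carry the category and size. All of these field names should be verified against a real feed before relying on them.

```typescript
interface FeedEntry {
  torrentId: string;
  title: string;
  torrentUrl: string;
  category: string | null;
  size: string | null;
}

// Normalizes a parsed XML value: plain text, a number, or an object
// holding its text under "#text" (when attributes were parsed as well).
function text(value: unknown): string | null {
  if (value === undefined || value === null) return null;
  if (typeof value === "object") {
    const inner = (value as Record<string, unknown>)["#text"];
    return inner === undefined || inner === null ? null : String(inner);
  }
  return String(value);
}

// Returns null for items missing a title, link, or recognizable torrent ID.
function toFeedEntry(item: Record<string, unknown>): FeedEntry | null {
  const title = text(item["title"]);
  const link = text(item["link"]);
  const guid = text(item["guid"]);
  const id =
    /\/view\/(\d+)/.exec(guid || "") || /\/download\/(\d+)\.torrent/.exec(link || "");
  if (!title || !link || !id) return null;
  return {
    torrentId: id[1],
    title,
    torrentUrl: link,
    category: text(item["nyaa:category"]),
    size: text(item["nyaa:size"]),
  };
}
```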

Open Questions for Product Owner

  • What poll interval do you consider acceptable by default (e.g. 5, 10, or 15 minutes)?
  • Do you want any basic auth in front of the UI out of the box, or will this live behind an existing reverse proxy?