multi-library: per-library excluded_dirs

Adds a nullable comma-separated TEXT column to the libraries table.
Effective excludes for a walk = (env-var globals) ∪
(library.excluded_dirs). Empty / NULL = no library-specific
extras; the global env var still applies.

Migration (2026-05-01-110000_libraries_excluded_dirs)

  ALTER TABLE libraries ADD COLUMN excluded_dirs TEXT. NULL on every
  existing row — no behavior change on upgrade.

Library struct + helpers (libraries.rs)

  - Library gains excluded_dirs: Vec<String>, parsed from the column
    by parse_excluded_dirs_column (drops empties / whitespace,
    matches the env-var parser).
  - Library::effective_excluded_dirs(globals) returns the union.
  - From<LibraryRow> hydrates the field on AppState construction so
    /libraries surfaces it.

Watcher / walkers / memories

  Every per-library walker now consults the effective set:
    - process_new_files (file-watch ingest, RAW/EXIF/face)
    - process_face_backlog (filter_excluded inherits)
    - create_thumbnails (startup + new-file branch)
    - update_media_counts (Prometheus gauge)
    - cleanup_orphaned_playlists (per-library source-existence check)
    - memories endpoint (PathExcluder)

  Effective set is computed once per per-library iteration in the
  watcher tick and threaded through; called functions retain their
  flat &[String] signature (no per-library awareness needed inside
  the walker primitives).

Use case: mount a parent directory while a sibling library covers
a child subtree, and exclude the child subtree from the parent so
the libraries don't double-walk / double-write image_exif. With
hash-keyed derived data (Branches B/C), the duplication-avoidance
is the only cost prevented — face / tag / insight sharing was
already correct via content_hash.

Tests: 228 pass (226 from previous + 2 new in libraries::tests:
parse_excluded_dirs_column edge cases,
effective_excluded_dirs_unions_global_and_per_library).

CLAUDE.md gains a "Per-library excludes" subsection of the
multi-library data model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-01 19:54:17 +00:00
parent 4f17af688e
commit 814066551e
11 changed files with 149 additions and 8 deletions

View File

@@ -210,6 +210,22 @@ Typical workflows: stage a new mount with `enabled=0` then flip to `1`;
quiet a flaky NAS during maintenance without disturbing the rest of
the system.
**Per-library excludes (`libraries.excluded_dirs`).** A
comma-separated column, same shape as the global `EXCLUDED_DIRS` env
var, that's applied **in union** with the env-var globals when a
walker scans this library. Use case: mount a parent directory as a
new library while a sibling library covers a child subtree, and
exclude that child subtree from the parent so the two libraries
don't double-walk and double-write `image_exif`. Hash-keyed derived
data (faces, tags, insights) is unaffected either way — those
follow the bytes — but `image_exif` row count, walker CPU, and
thumbnail disk usage all drop to 1× instead of 2× for the overlap.
Affects: file-watch ingest (`process_new_files`), thumbnail
generation, media-count gauges, the orphaned-playlist cleanup walk,
and the `/memories` endpoint. The face-detection backlog drain
inherits via `face_watch::filter_excluded`. NULL = no extras (only
the global env var applies).
**Library availability and safety.** Libraries can be on network shares
or removable media; the file watcher must not interpret a temporary
unavailability as a mass-deletion event. Every tick begins with a