hls: per-library readiness gauges + GET /hls/stats endpoint

The hash-keyed pipeline transcodes lazily, so a freshly mounted (or
freshly upgraded) library is "mostly pending" for the first hour
while the watcher works through the backlog. The operator wants a
live read on remaining work so they can tune `HLS_CONCURRENCY` and
know when to stop waiting.

Adds:

- `src/hls_stats.rs` — pure compute path (`stats_from_rows`) and an
  `Arc<Mutex<dyn ExifDao>>` wrapper (`compute_and_publish`). Per
  library: `total`, `with_playlist`, `pending`, `unsupported`,
  `hashless_videos`. Dedup is by `content_hash`, so the same bytes
  at N paths count once (same domain rule as `faces::stats`).
  `hashless_videos` is a separate counter so the operator can see
  the depth of the "hash backfill, then transcode" pipeline instead
  of having NULL-hash rows silently drop out of the counts.

- Prometheus gauges labeled by library name:
  `imageserver_hls_videos_total`, `..._with_playlist`, `..._pending`,
  `..._unsupported`. Updated by the watcher at the end of every full-
  scan tick *and* on every `/hls/stats` hit, so whichever surface the
  operator is watching stays fresh. Registered in `main` alongside
  the existing image/video gauges.

- `GET /hls/stats` — Claims-protected JSON snapshot of the same data
  plus a top-level cross-library aggregate. Runs on a blocking pool
  so it doesn't pin the actix worker; per-call cost is one
  `list_paths_and_hashes_for_library` SQL query per library plus a
  `stat()` per distinct video hash. Bounded — never invoked from
  middleware, only from the explicit endpoint and the full-scan
  tick. The watcher's end-of-tick `info!` summary line mirrors the
  endpoint output for operators tailing the log.

- New `ExifDao::list_paths_and_hashes_for_library` method:
  `SELECT rel_path, content_hash FROM image_exif WHERE library_id =
  ?`. Single round-trip; callers filter to video extensions
  client-side because the schema doesn't carry media-type. Mock
  impl in `files.rs` returns an empty vec.
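
For readers who want the counting rules in one place, here is a minimal
sketch of a `stats_from_rows`-style function. It is not the code in
`src/hls_stats.rs`: the `LibraryStats` and `Readiness` types, the
`is_video` helper with its extension list, and the `classify` callback
are all hypothetical names introduced for illustration. It also assumes
that `total` counts only hashed videos and that every distinct hash is
exactly one of playlist, unsupported, or pending.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Per-library counters named in the commit message (struct name is assumed).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct LibraryStats {
    pub total: usize,
    pub with_playlist: usize,
    pub pending: usize,
    pub unsupported: usize,
    pub hashless_videos: usize,
}

/// Hypothetical readiness state for one distinct content hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Readiness {
    Playlist,
    Unsupported,
    Pending,
}

/// Assumed extension list; the real filter lives elsewhere in the crate.
fn is_video(rel_path: &str) -> bool {
    const VIDEO_EXTS: [&str; 4] = ["mp4", "mov", "mkv", "avi"];
    Path::new(rel_path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| VIDEO_EXTS.contains(&e.to_ascii_lowercase().as_str()))
        .unwrap_or(false)
}

/// Pure compute path: rows are `(rel_path, content_hash)` pairs, and
/// `classify` decides the readiness of a single hash.
pub fn stats_from_rows<F>(rows: &[(String, Option<String>)], classify: F) -> LibraryStats
where
    F: Fn(&str) -> Readiness,
{
    let mut stats = LibraryStats::default();
    let mut seen: HashSet<&str> = HashSet::new();
    for (rel_path, hash) in rows {
        // Videos-only filter: the schema does not carry a media type.
        if !is_video(rel_path) {
            continue;
        }
        // NULL-hash videos are counted separately, not folded into `total`.
        let Some(hash) = hash.as_deref() else {
            stats.hashless_videos += 1;
            continue;
        };
        // Dedup by content hash: the same bytes at N paths count once.
        if !seen.insert(hash) {
            continue;
        }
        stats.total += 1;
        match classify(hash) {
            Readiness::Playlist => stats.with_playlist += 1,
            Readiness::Unsupported => stats.unsupported += 1,
            Readiness::Pending => stats.pending += 1,
        }
    }
    stats
}
```

Keeping the classification behind a callback is one way to make the
compute path testable without touching the filesystem.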

Tests in `hls_stats::tests` exercise `stats_from_rows` directly
(videos-only filter, hash dedup, playlist vs sentinel decision,
NULL-hash hashless counting) plus a `publish_gauges` round-trip that
reads the gauge value back. Full suite (347 lib + 360 bin = 707)
passes.
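
The playlist vs sentinel decision mentioned above could look roughly
like the sketch below. The on-disk layout is an assumption made only
for illustration: a `<hash>.m3u8` playlist and a `<hash>.unsupported`
sentinel directly under the video directory. The real layout, the
`classify_hash` name, and the `Readiness` enum are not taken from this
commit, and this version may do two metadata lookups per hash where
the real code does one.

```rust
use std::path::Path;

/// Hypothetical readiness state for one distinct content hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Readiness {
    Playlist,
    Unsupported,
    Pending,
}

/// Classify one hash by checking for files under `video_dir`.
/// Assumed layout: `<video_dir>/<hash>.m3u8` for a finished transcode,
/// `<video_dir>/<hash>.unsupported` as the "cannot transcode" sentinel.
pub fn classify_hash(video_dir: &Path, content_hash: &str) -> Readiness {
    let playlist = video_dir.join(format!("{content_hash}.m3u8"));
    let sentinel = video_dir.join(format!("{content_hash}.unsupported"));
    if playlist.is_file() {
        // A playlist wins even if a stale sentinel is also present.
        Readiness::Playlist
    } else if sentinel.is_file() {
        Readiness::Unsupported
    } else {
        // Neither file exists yet: the watcher has not reached this hash.
        Readiness::Pending
    }
}
```

Because the result depends only on what is on disk, a test can build a
temporary directory with one playlist and one sentinel and assert all
three outcomes.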

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-05-14 15:58:46 -04:00
parent 7c153596fe
commit 7cd1ea3cf8
5 changed files with 488 additions and 0 deletions


@@ -32,6 +32,7 @@ use crate::exif;
use crate::face_watch;
use crate::faces;
use crate::file_types;
use crate::hls_stats;
use crate::libraries;
use crate::library_maintenance;
use crate::perceptual_hash;
@@ -580,6 +581,20 @@ pub fn watch_files(
    }
    if is_full_scan {
        // End-of-full-scan HLS readiness summary: log a single
        // info line + refresh the Prometheus gauges. Skipped on
        // quick scans because the cost is non-trivial on big
        // libraries and the data only meaningfully changes on
        // full passes.
        let video_dir_str =
            dotenv::var("VIDEO_PATH").expect("VIDEO_PATH must be set");
        let stats = hls_stats::compute_and_publish(
            &libs,
            &exif_dao,
            Path::new(&video_dir_str),
        );
        hls_stats::log_summary(&stats);
        last_full_scan = now;
    }
    last_quick_scan = now;