hls: per-library readiness gauges + GET /hls/stats endpoint
The hash-keyed pipeline transcodes lazily, so a freshly mounted (or freshly upgraded) library is "mostly pending" for the first hour while the watcher works through the backlog. The operator wants a live read on remaining work so they can tune `HLS_CONCURRENCY` and know when to stop waiting.

Adds:

- `src/hls_stats.rs`: pure compute path (`stats_from_rows`) and a wrapper (`compute_and_publish`) that takes the shared `Arc<Mutex<Box<dyn ExifDao>>>`. Per library: `total`, `with_playlist`, `pending`, `unsupported`, `hashless_videos`. Dedup is by content_hash, so the same bytes at N paths count once (same domain rule as `faces::stats`). `hashless_videos` is a separate counter so the operator can see the "hash backfill, then transcode" pipeline depth instead of NULL-hash rows silently dropping out of the counts.
- Prometheus gauges labeled by library name: `imageserver_hls_videos_total`, `..._with_playlist`, `..._pending`, `..._unsupported`. Updated by the watcher at the end of every full-scan tick *and* on every `/hls/stats` hit, so whichever surface the operator is watching stays fresh. Registered in `main` alongside the existing image/video gauges.
- `GET /hls/stats`: Claims-protected JSON snapshot of the same data plus a top-level cross-library aggregate. Runs on a blocking pool so it doesn't pin the actix worker; per-call cost is one `list_paths_and_hashes_for_library` SQL query per library plus a `stat()` per distinct video hash (and a second `stat()` for the `.unsupported` sentinel when the playlist is missing). Bounded: never invoked from middleware, only from the explicit endpoint and the full-scan tick. The watcher's end-of-tick `info!` summary line mirrors the endpoint output for operators tailing the log.
- New `ExifDao::list_paths_and_hashes_for_library` method: `SELECT rel_path, content_hash FROM image_exif WHERE library_id = ?`. Single round-trip; callers filter to video extensions client-side because the schema doesn't carry media-type. Mock impl in `files.rs` returns an empty vec.
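For illustration, the counting rule described above can be sketched without the DAO or the filesystem. This is a minimal sketch, not the code in `src/hls_stats.rs`: `Counts`, `count_rows`, and `looks_like_video` are hypothetical names, the extension list is an assumption, and the playlist and sentinel checks are injected as closures in place of the real `stat()` calls.

```rust
use std::collections::HashSet;

/// Hypothetical result type mirroring the per-library fields named above.
#[derive(Debug, PartialEq, Eq)]
pub struct Counts {
    pub total: usize,
    pub with_playlist: usize,
    pub unsupported: usize,
    pub pending: usize,
    pub hashless_videos: usize,
}

/// Assumed stand-in for the crate's video-extension check.
fn looks_like_video(rel_path: &str) -> bool {
    let lower = rel_path.to_ascii_lowercase();
    [".mp4", ".mov"].iter().any(|ext| lower.ends_with(ext))
}

/// Counts distinct video hashes; `has_playlist` and `has_sentinel`
/// replace the on-disk checks so the sketch runs anywhere.
pub fn count_rows(
    rows: &[(&str, Option<&str>)],
    has_playlist: impl Fn(&str) -> bool,
    has_sentinel: impl Fn(&str) -> bool,
) -> Counts {
    let mut hashes: HashSet<&str> = HashSet::new();
    let mut hashless_videos = 0;
    for &(rel_path, hash) in rows {
        if !looks_like_video(rel_path) {
            continue; // images and other files never count
        }
        match hash {
            Some(h) => {
                hashes.insert(h); // duplicate bytes collapse here
            }
            None => hashless_videos += 1, // still waiting on hash backfill
        }
    }

    let mut with_playlist = 0;
    let mut unsupported = 0;
    for &h in &hashes {
        if has_playlist(h) {
            with_playlist += 1;
        } else if has_sentinel(h) {
            unsupported += 1;
        }
    }

    let total = hashes.len();
    Counts {
        total,
        with_playlist,
        unsupported,
        pending: total.saturating_sub(with_playlist + unsupported),
        hashless_videos,
    }
}
```

Given one image, two paths sharing hash `h1`, a finished `h2`, a rejected `h3`, and one NULL-hash video, `count_rows` reports total 3, with_playlist 1, unsupported 1, pending 1, and hashless_videos 1.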
Tests in `hls_stats::tests` exercise `stats_from_rows` directly (videos-only filter, hash dedup, playlist vs sentinel decision, NULL-hash hashless counting) plus a `publish_gauges` round-trip that reads the gauge value back. Full suite (347 lib + 360 bin = 707) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
410
src/hls_stats.rs
Normal file
@@ -0,0 +1,410 @@
//! Per-library HLS readiness: Prometheus gauges + `/hls/stats` endpoint.
//!
//! The new hash-keyed pipeline transcodes lazily, so most of a freshly
//! mounted library is "pending" for the first hour, and operators want
//! a live read on "how much work is left, am I CPU-bound, do I need to
//! bump `HLS_CONCURRENCY`." This module supplies both surfaces against
//! the same compute path:
//!
//! - **Prometheus gauges** `imageserver_hls_videos_total{library}`,
//!   `..._with_playlist{library}`, `..._pending{library}`,
//!   `..._unsupported{library}`. Updated every watcher full-scan tick
//!   and on every `/hls/stats` request, so the freshness matches
//!   whichever surface the operator is watching.
//!
//! - **`GET /hls/stats`** returns a JSON snapshot of the same counts
//!   plus a top-level cross-library aggregate. Claims-protected
//!   (matches every other authenticated read in this crate).
//!
//! Cost is O(distinct video hashes per library): each hash needs one
//! `stat()` on the playlist file, plus a second on the `.unsupported`
//! sentinel when the playlist is missing. On a 100k-video library that's
//! noticeable; on a typical home library (few thousand) it's noise.
//! We call from explicit triggers only, never per-request from
//! middleware, so the cost is bounded.

use std::collections::HashSet;
use std::path::Path;
use std::sync::{Arc, Mutex};

use actix_web::{HttpResponse, Responder, get, web};
use lazy_static::lazy_static;
use log::{info, warn};
use prometheus::IntGaugeVec;
use serde::Serialize;

use crate::data::Claims;
use crate::database::ExifDao;
use crate::file_types;
use crate::libraries::Library;
use crate::state::AppState;
use crate::video::hls_paths;

lazy_static! {
    pub static ref HLS_VIDEOS_TOTAL: IntGaugeVec = IntGaugeVec::new(
        prometheus::Opts::new(
            "imageserver_hls_videos_total",
            "Distinct video content hashes per library known to image_exif",
        ),
        &["library"],
    )
    .expect("HLS_VIDEOS_TOTAL");
    pub static ref HLS_VIDEOS_WITH_PLAYLIST: IntGaugeVec = IntGaugeVec::new(
        prometheus::Opts::new(
            "imageserver_hls_videos_with_playlist",
            "Videos whose hash-keyed HLS playlist is already on disk",
        ),
        &["library"],
    )
    .expect("HLS_VIDEOS_WITH_PLAYLIST");
    pub static ref HLS_VIDEOS_PENDING: IntGaugeVec = IntGaugeVec::new(
        prometheus::Opts::new(
            "imageserver_hls_videos_pending",
            "Videos whose hash-keyed HLS playlist is not yet on disk",
        ),
        &["library"],
    )
    .expect("HLS_VIDEOS_PENDING");
    pub static ref HLS_VIDEOS_UNSUPPORTED: IntGaugeVec = IntGaugeVec::new(
        prometheus::Opts::new(
            "imageserver_hls_videos_unsupported",
            "Videos with an `.unsupported` sentinel — ffmpeg refused; \
             operator must delete to retry",
        ),
        &["library"],
    )
    .expect("HLS_VIDEOS_UNSUPPORTED");
}

/// Per-library HLS readiness snapshot.
#[derive(Serialize, Debug, Clone, PartialEq, Eq)]
pub struct HlsLibraryStats {
    pub library_id: i32,
    pub library: String,
    /// Distinct video content hashes (dedupes intra-library bytes-at-N-paths).
    pub total: usize,
    /// Of `total`, hashes whose `playlist.m3u8` is on disk.
    pub with_playlist: usize,
    /// Of `total`, hashes whose ffmpeg attempt left a `.unsupported`
    /// sentinel. Counted separately because they won't progress without
    /// operator intervention (delete the sentinel to retry).
    pub unsupported: usize,
    /// `total - (with_playlist + unsupported)`: videos awaiting transcode.
    pub pending: usize,
    /// Distinct rel_paths under this library that are video files but
    /// whose `image_exif.content_hash` is still NULL (mid-backfill).
    /// These don't yet count toward `total` because they're invisible
    /// to the hash-keyed pipeline; surfaced so the operator can see
    /// "hash backfill, then transcode" pipeline depth.
    pub hashless_videos: usize,
}

/// JSON response body for `GET /hls/stats`.
#[derive(Serialize, Debug)]
pub struct HlsStatsResponse {
    pub libraries: Vec<HlsLibraryStats>,
    pub total: usize,
    pub with_playlist: usize,
    pub pending: usize,
    pub unsupported: usize,
    pub hashless_videos: usize,
}

/// Compute current readiness per library and publish to Prometheus.
/// Returns the same data so callers can serialise it. The publish step
/// is idempotent on the gauge: old values get overwritten.
pub fn compute_and_publish(
    libraries: &[Library],
    exif_dao: &Arc<Mutex<Box<dyn ExifDao>>>,
    video_dir: &Path,
) -> Vec<HlsLibraryStats> {
    let ctx = opentelemetry::Context::new();
    let mut out = Vec::with_capacity(libraries.len());
    for lib in libraries {
        let stats = compute_for_library(&ctx, lib, exif_dao, video_dir);
        publish_gauges(&stats);
        out.push(stats);
    }
    out
}

fn publish_gauges(s: &HlsLibraryStats) {
    HLS_VIDEOS_TOTAL
        .with_label_values(&[s.library.as_str()])
        .set(s.total as i64);
    HLS_VIDEOS_WITH_PLAYLIST
        .with_label_values(&[s.library.as_str()])
        .set(s.with_playlist as i64);
    HLS_VIDEOS_PENDING
        .with_label_values(&[s.library.as_str()])
        .set(s.pending as i64);
    HLS_VIDEOS_UNSUPPORTED
        .with_label_values(&[s.library.as_str()])
        .set(s.unsupported as i64);
}

fn compute_for_library(
    ctx: &opentelemetry::Context,
    lib: &Library,
    exif_dao: &Arc<Mutex<Box<dyn ExifDao>>>,
    video_dir: &Path,
) -> HlsLibraryStats {
    let rows = {
        let mut dao = exif_dao.lock().expect("Unable to lock ExifDao");
        match dao.list_paths_and_hashes_for_library(ctx, lib.id) {
            Ok(r) => r,
            Err(e) => {
                warn!(
                    "hls_stats: list_paths_and_hashes_for_library failed for lib {}: {:?}",
                    lib.id, e
                );
                Vec::new()
            }
        }
    };
    stats_from_rows(lib, &rows, video_dir)
}

/// Pure function: same compute as [`compute_for_library`] but works
/// on caller-supplied rows. Split out so tests don't need a full
/// `ExifDao` mock; the integration path is exercised through
/// `compute_and_publish` against the real SQLite DAO at runtime.
fn stats_from_rows(
    lib: &Library,
    rows: &[(String, Option<String>)],
    video_dir: &Path,
) -> HlsLibraryStats {
    let mut hashes: HashSet<String> = HashSet::new();
    let mut hashless_videos = 0usize;
    for (rel_path, hash_opt) in rows {
        if !file_types::is_video_file(Path::new(rel_path)) {
            continue;
        }
        match hash_opt {
            Some(h) => {
                hashes.insert(h.clone());
            }
            None => {
                hashless_videos += 1;
            }
        }
    }

    let mut with_playlist = 0usize;
    let mut unsupported = 0usize;
    for h in &hashes {
        if hls_paths::playlist_for_hash(video_dir, h).exists() {
            with_playlist += 1;
        } else if hls_paths::sentinel_for_hash(video_dir, h).exists() {
            unsupported += 1;
        }
    }
    let total = hashes.len();
    let pending = total.saturating_sub(with_playlist + unsupported);

    HlsLibraryStats {
        library_id: lib.id,
        library: lib.name.clone(),
        total,
        with_playlist,
        unsupported,
        pending,
        hashless_videos,
    }
}

/// Log a single info line summarising readiness across all libraries.
/// Called by the watcher at the end of a full-scan tick so operators
/// who tail the log see the headline number without scraping
/// Prometheus.
pub fn log_summary(stats: &[HlsLibraryStats]) {
    let total: usize = stats.iter().map(|s| s.total).sum();
    let with_playlist: usize = stats.iter().map(|s| s.with_playlist).sum();
    let pending: usize = stats.iter().map(|s| s.pending).sum();
    let unsupported: usize = stats.iter().map(|s| s.unsupported).sum();
    let hashless: usize = stats.iter().map(|s| s.hashless_videos).sum();

    let per_lib: Vec<String> = stats
        .iter()
        .map(|s| {
            format!(
                "{}={}/{} pending={} unsupported={} hashless={}",
                s.library, s.with_playlist, s.total, s.pending, s.unsupported, s.hashless_videos,
            )
        })
        .collect();

    info!(
        "HLS readiness: {}/{} playlists on disk, {} pending, {} unsupported, {} hashless videos | per-library: [{}]",
        with_playlist,
        total,
        pending,
        unsupported,
        hashless,
        per_lib.join(", "),
    );
}

#[get("/hls/stats")]
pub async fn hls_stats_handler(
    _claims: Claims,
    app_state: web::Data<AppState>,
    exif_dao: web::Data<Mutex<Box<dyn ExifDao>>>,
) -> impl Responder {
    let libraries = app_state.libraries.clone();
    let video_dir = std::path::PathBuf::from(&app_state.video_path);
    let exif_dao = exif_dao.into_inner();

    // Synchronous file IO + DB query: run on a blocking pool so the
    // actix worker thread stays free for other requests.
    let stats = match web::block(move || compute_and_publish(&libraries, &exif_dao, &video_dir))
        .await
    {
        Ok(s) => s,
        Err(e) => {
            warn!("/hls/stats: blocking task failed: {:?}", e);
            Vec::new()
        }
    };

    let total: usize = stats.iter().map(|s| s.total).sum();
    let with_playlist: usize = stats.iter().map(|s| s.with_playlist).sum();
    let pending: usize = stats.iter().map(|s| s.pending).sum();
    let unsupported: usize = stats.iter().map(|s| s.unsupported).sum();
    let hashless_videos: usize = stats.iter().map(|s| s.hashless_videos).sum();

    HttpResponse::Ok().json(HlsStatsResponse {
        libraries: stats,
        total,
        with_playlist,
        pending,
        unsupported,
        hashless_videos,
    })
}

#[cfg(test)]
mod tests {
    use super::*;
    use tempfile::tempdir;

    fn lib(id: i32, name: &str) -> Library {
        Library {
            id,
            name: name.into(),
            root_path: String::new(),
            enabled: true,
            excluded_dirs: Vec::new(),
        }
    }

    fn rows(vs: Vec<(&str, Option<&str>)>) -> Vec<(String, Option<String>)> {
        vs.into_iter()
            .map(|(p, h)| (p.to_string(), h.map(|s| s.to_string())))
            .collect()
    }

    fn touch(dir: &Path, rel: &str) {
        let p = dir.join(rel);
        std::fs::create_dir_all(p.parent().unwrap()).unwrap();
        std::fs::write(p, b"").unwrap();
    }

    #[test]
    fn videos_only_count_in_total() {
        let tmp = tempdir().unwrap();
        let r = rows(vec![
            ("photos/IMG.jpg", Some(&"a".repeat(64))), // image: ignored
            ("clip.mp4", Some(&"b".repeat(64))),
            ("vid.mov", Some(&"c".repeat(64))),
        ]);
        let stats = stats_from_rows(&lib(1, "main"), &r, tmp.path());
        assert_eq!(stats.total, 2);
        assert_eq!(stats.with_playlist, 0);
        assert_eq!(stats.pending, 2);
        assert_eq!(stats.unsupported, 0);
        assert_eq!(stats.hashless_videos, 0);
    }

    #[test]
    fn hash_dedup_collapses_duplicate_rel_paths() {
        let tmp = tempdir().unwrap();
        let r = rows(vec![
            ("a/clip.mp4", Some(&"a".repeat(64))),
            ("b/clip.mp4", Some(&"a".repeat(64))), // same bytes, dup
            ("other.mp4", Some(&"b".repeat(64))),
        ]);
        let stats = stats_from_rows(&lib(1, "main"), &r, tmp.path());
        assert_eq!(stats.total, 2, "duplicate hashes collapse");
    }

    #[test]
    fn playlist_existence_promotes_to_with_playlist() {
        let tmp = tempdir().unwrap();
        let hash = "a".repeat(64);
        touch(tmp.path(), &format!("aa/{}/playlist.m3u8", hash));

        let r = rows(vec![("clip.mp4", Some(&hash))]);
        let stats = stats_from_rows(&lib(1, "main"), &r, tmp.path());
        assert_eq!(stats.total, 1);
        assert_eq!(stats.with_playlist, 1);
        assert_eq!(stats.pending, 0);
    }

    #[test]
    fn sentinel_existence_promotes_to_unsupported() {
        let tmp = tempdir().unwrap();
        let hash = "b".repeat(64);
        touch(tmp.path(), &format!("bb/{}/playlist.unsupported", hash));

        let r = rows(vec![("clip.mov", Some(&hash))]);
        let stats = stats_from_rows(&lib(1, "main"), &r, tmp.path());
        assert_eq!(stats.total, 1);
        assert_eq!(stats.unsupported, 1);
        assert_eq!(stats.with_playlist, 0);
        assert_eq!(stats.pending, 0);
    }

    #[test]
    fn null_hash_videos_are_hashless_not_total() {
        let tmp = tempdir().unwrap();
        let r = rows(vec![
            ("clip.mp4", None),
            ("other.mp4", Some(&"a".repeat(64))),
        ]);
        let stats = stats_from_rows(&lib(1, "main"), &r, tmp.path());
        assert_eq!(stats.total, 1, "hashless row excluded from total");
        assert_eq!(stats.hashless_videos, 1);
    }

    #[test]
    fn publish_gauges_sets_per_library_value() {
        let s = HlsLibraryStats {
            library_id: 7,
            library: "test_publish_a".into(),
            total: 5,
            with_playlist: 2,
            pending: 3,
            unsupported: 0,
            hashless_videos: 0,
        };
        publish_gauges(&s);
        assert_eq!(
            HLS_VIDEOS_TOTAL
                .with_label_values(&["test_publish_a"])
                .get(),
            5
        );
        assert_eq!(
            HLS_VIDEOS_PENDING
                .with_label_values(&["test_publish_a"])
                .get(),
            3
        );
        assert_eq!(
            HLS_VIDEOS_WITH_PLAYLIST
                .with_label_values(&["test_publish_a"])
                .get(),
            2
        );
    }
}
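As a worked example of the end-of-tick summary, the sketch below rebuilds the log line from plain structs so the format can be checked in isolation. `LibStats` and `summary_line` are hypothetical stand-ins for `HlsLibraryStats` and `log_summary`; the format strings are copied from the diff above, and the function returns the text instead of passing it to `info!`.

```rust
/// Hypothetical stand-in for `HlsLibraryStats`, reduced to the logged fields.
pub struct LibStats {
    pub library: String,
    pub total: usize,
    pub with_playlist: usize,
    pub pending: usize,
    pub unsupported: usize,
    pub hashless_videos: usize,
}

/// Builds the summary text: cross-library sums first, then one
/// `name=done/total ...` entry per library.
pub fn summary_line(stats: &[LibStats]) -> String {
    let total: usize = stats.iter().map(|s| s.total).sum();
    let with_playlist: usize = stats.iter().map(|s| s.with_playlist).sum();
    let pending: usize = stats.iter().map(|s| s.pending).sum();
    let unsupported: usize = stats.iter().map(|s| s.unsupported).sum();
    let hashless: usize = stats.iter().map(|s| s.hashless_videos).sum();

    let per_lib: Vec<String> = stats
        .iter()
        .map(|s| {
            format!(
                "{}={}/{} pending={} unsupported={} hashless={}",
                s.library, s.with_playlist, s.total, s.pending, s.unsupported, s.hashless_videos,
            )
        })
        .collect();

    format!(
        "HLS readiness: {}/{} playlists on disk, {} pending, {} unsupported, {} hashless videos | per-library: [{}]",
        with_playlist,
        total,
        pending,
        unsupported,
        hashless,
        per_lib.join(", "),
    )
}
```

For a `main` library at 2 of 5 with 3 pending and 1 hashless, and an `archive` library at 1 of 4 with 1 pending and 2 unsupported, the headline reads `3/9 playlists on disk, 4 pending, 2 unsupported, 1 hashless videos`, followed by the two per-library entries.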