hls: rewire queue + generator to write hash-keyed playlists

Switches the watcher → VideoPlaylistManager → PlaylistGenerator path
from the basename-keyed layout
(`$VIDEO_PATH/{basename}.m3u8`) to the hash-keyed layout
(`$VIDEO_PATH/{hash[..2]}/{hash}/playlist.m3u8`) introduced in the
prior commit. Source videos that share a basename across libraries
(or across subdirs of one library) no longer overwrite each other's
playlists. The legacy HTTP endpoints in `/video/generate` /
`/video/stream` still use the basename layout — those move in a
follow-up commit alongside the stable streaming URL.

actors.rs:
- `QueueVideosMessage.video_paths: Vec<PathBuf>` →
  `videos: Vec<VideoToQueue>`. The queue handler dedups against the
  hash-keyed playlist + sentinel and forwards `GeneratePlaylistMessage`
  carrying the hash.
- `GeneratePlaylistMessage` now carries `content_hash: String`; the
  legacy `playlist_path: String` field is gone.
- `PlaylistGenerator` takes a `video_dir: PathBuf` at construction,
  computes the hash dir + playlist + sentinel + segment template via
  `hls_paths`, `mkdir -p`s the shard/hash dir before ffmpeg runs, and
  cleans up partial output on failure by walking the hash dir.
- `ScanDirectoryMessage` and its handler are retired entirely; their
  startup-walk role is taken over by the watcher's first tick (see
  `watcher.rs` below). Dropping it avoids threading an `ExifDao` into
  `VideoPlaylistManager` just so the actor can resolve hashes.
- Legacy `playlist_file_for` / `playlist_unsupported_sentinel` are
  retained behind `#[allow(dead_code)]` for the upcoming migration
  pass that retires pre-content-hash output.

watcher.rs:
- `process_new_files` keeps `content_hash` in the EXIF-batch result
  (formerly threw it away). Videos with `image_exif.content_hash =
  NULL` — mid-backfill rows — are skipped this tick rather than
  falling back to a basename-colliding playlist; they get picked up
  after `backfill_unhashed_backlog` populates the hash on a
  subsequent tick. Skipped count is logged at debug.
- The video staleness check now uses `hls_paths::playlist_for_hash`
  instead of `$VIDEO_PATH/{basename}.m3u8`.
- `last_full_scan` initialises to `UNIX_EPOCH` so the watcher's first
  tick is treated as a full scan. That covers the catch-up gap left
  by removing `ScanDirectoryMessage` — every library's existing media
  is checked once at watcher boot (≈60s after startup) instead of
  waiting up to `WATCH_FULL_INTERVAL_SECONDS` (1h default).

main.rs: removes the `ScanDirectoryMessage` import and the per-library
`do_send` loop, with a comment pointing at the watcher's first-tick
behavior.

state.rs: `PlaylistGenerator::new` now takes the video dir.

Tests: existing `video::hls_paths` (4) and `watcher::tests` (4) pass.
The basename-keyed `/video/generate` endpoint still compiles and
serves; behavior change there is deferred to the follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-14 15:36:01 -04:00
parent c71e1cdce0
commit d1667099c3
4 changed files with 161 additions and 178 deletions

View File

@@ -1,8 +1,9 @@
use crate::content_hash;
use crate::database::PreviewDao;
use crate::libraries::Library;
use crate::otel::global_tracer;
use crate::thumbnails::is_video;
use crate::video::ffmpeg::{generate_preview_clip, get_duration_seconds_blocking};
use crate::video::hls_paths;
use actix::prelude::*;
use log::{debug, error, info, trace, warn};
use opentelemetry::KeyValue;
@@ -12,7 +13,6 @@ use std::path::{Path, PathBuf};
use std::process::{Child, Command, ExitStatus, Stdio};
use std::sync::{Arc, Mutex};
use tokio::sync::Semaphore;
use walkdir::{DirEntry, WalkDir};
// ffmpeg -i test.mp4 -c:v h264 -flags +cgop -g 30 -hls_time 3 out.m3u8
// ffmpeg -i "filename.mp4" -preset veryfast -c:v libx264 -f hls -hls_list_size 100 -hls_time 2 -crf 24 -vf scale=1080:-2,setsar=1:1 attempt/vid_out.m3u8
@@ -57,9 +57,11 @@ pub struct VideoToQueue {
pub content_hash: String,
}
/// Legacy basename-keyed playlist path. Used only by the one-shot startup
/// migration that retires pre-content-hash output; once cleared, all new
/// playlist writes go through [`hls_paths::playlist_for_hash`].
/// Legacy basename-keyed playlist path. Retained for the one-shot startup
/// migration that retires pre-content-hash output; new playlist writes go
/// through [`hls_paths::playlist_for_hash`]. Will be removed once the
/// migration ships and runs to completion in production.
#[allow(dead_code)]
pub fn playlist_file_for(playlist_dir: &str, video_path: &Path) -> PathBuf {
let filename = video_path
.file_name()
@@ -70,6 +72,7 @@ pub fn playlist_file_for(playlist_dir: &str, video_path: &Path) -> PathBuf {
/// Legacy basename-keyed sentinel path. Same migration-only contract as
/// [`playlist_file_for`].
#[allow(dead_code)]
pub fn playlist_unsupported_sentinel(playlist_file: &Path) -> PathBuf {
let mut s = playlist_file.as_os_str().to_owned();
s.push(".unsupported");
@@ -342,17 +345,17 @@ async fn get_max_gop_seconds(video_path: &str) -> Option<f64> {
}
pub struct VideoPlaylistManager {
playlist_dir: PathBuf,
video_dir: PathBuf,
playlist_generator: Addr<PlaylistGenerator>,
}
impl VideoPlaylistManager {
pub fn new<P: Into<PathBuf>>(
playlist_dir: P,
video_dir: P,
playlist_generator: Addr<PlaylistGenerator>,
) -> Self {
Self {
playlist_dir: playlist_dir.into(),
video_dir: video_dir.into(),
playlist_generator,
}
}
@@ -362,144 +365,68 @@ impl Actor for VideoPlaylistManager {
type Context = Context<Self>;
}
impl Handler<ScanDirectoryMessage> for VideoPlaylistManager {
type Result = ResponseFuture<()>;
fn handle(&mut self, msg: ScanDirectoryMessage, _ctx: &mut Self::Context) -> Self::Result {
let tracer = global_tracer();
let mut span = tracer.start("videoplaylistmanager.scan_directory");
let start = std::time::Instant::now();
info!(
"Starting scan directory for video playlist generation: {}",
msg.directory
);
let playlist_output_dir = self.playlist_dir.clone();
let playlist_dir_str = playlist_output_dir.to_str().unwrap().to_string();
let video_files = WalkDir::new(&msg.directory)
.into_iter()
.filter_map(|e| e.ok())
.filter(|e| e.file_type().is_file())
.filter(is_video)
.filter(|e| {
let playlist = playlist_file_for(&playlist_dir_str, e.path());
!playlist.exists() && !playlist_unsupported_sentinel(&playlist).exists()
})
.collect::<Vec<DirEntry>>();
let scan_dir_name = msg.directory.clone();
let playlist_generator = self.playlist_generator.clone();
Box::pin(async move {
for e in video_files {
let path = e.path();
let path_as_str = path.to_str().unwrap();
debug!(
"Sending generate playlist message for path: {}",
path_as_str
);
match playlist_generator
.send(GeneratePlaylistMessage {
playlist_path: playlist_output_dir.to_str().unwrap().to_string(),
video_path: PathBuf::from(path),
})
.await
.expect("Failed to send generate playlist message")
{
Ok(_) => {
span.add_event(
"Playlist generated",
vec![KeyValue::new("video_path", path_as_str.to_string())],
);
debug!(
"Successfully generated playlist for file: '{}'",
path_as_str
);
}
Err(e) if e.kind() == std::io::ErrorKind::AlreadyExists => {
debug!("Playlist already exists for '{:?}', skipping", path);
}
Err(e) => {
warn!("Failed to generate playlist for path '{:?}'. {:?}", path, e);
}
}
}
span.add_event(
"Finished directory scan",
vec![KeyValue::new("directory", scan_dir_name.to_string())],
);
info!(
"Finished directory scan of '{}' in {:?}",
scan_dir_name,
start.elapsed()
);
})
}
}
impl Handler<QueueVideosMessage> for VideoPlaylistManager {
type Result = ();
fn handle(&mut self, msg: QueueVideosMessage, _ctx: &mut Self::Context) -> Self::Result {
if msg.video_paths.is_empty() {
if msg.videos.is_empty() {
return;
}
info!(
"Queueing {} videos for HLS playlist generation",
msg.video_paths.len()
);
let playlist_output_dir = self.playlist_dir.clone();
let playlist_dir_str = playlist_output_dir.to_str().unwrap().to_string();
let video_dir = self.video_dir.clone();
let playlist_generator = self.playlist_generator.clone();
for video_path in msg.video_paths {
let playlist = playlist_file_for(&playlist_dir_str, &video_path);
if playlist.exists() || playlist_unsupported_sentinel(&playlist).exists() {
let mut queued = 0usize;
let mut already_present = 0usize;
for VideoToQueue {
video_path,
content_hash,
} in msg.videos
{
let playlist = hls_paths::playlist_for_hash(&video_dir, &content_hash);
let sentinel = hls_paths::sentinel_for_hash(&video_dir, &content_hash);
if playlist.exists() || sentinel.exists() {
already_present += 1;
continue;
}
let path_str = video_path.to_string_lossy().to_string();
debug!("Queueing playlist generation for: {}", path_str);
debug!(
"Queueing playlist generation for {} (hash={})",
video_path.display(),
short_hash(&content_hash)
);
playlist_generator.do_send(GeneratePlaylistMessage {
playlist_path: playlist_dir_str.clone(),
video_path,
content_hash,
});
queued += 1;
}
info!(
"Queue tick: {} queued, {} skipped (playlist or sentinel already on disk)",
queued, already_present
);
}
}
#[derive(Message)]
#[rtype(result = "()")]
pub struct ScanDirectoryMessage {
pub(crate) directory: String,
}
#[derive(Message)]
#[rtype(result = "()")]
pub struct QueueVideosMessage {
pub video_paths: Vec<PathBuf>,
pub videos: Vec<VideoToQueue>,
}
#[derive(Message)]
#[rtype(result = "Result<()>")]
pub struct GeneratePlaylistMessage {
pub video_path: PathBuf,
pub playlist_path: String,
pub content_hash: String,
}
pub struct PlaylistGenerator {
semaphore: Arc<Semaphore>,
video_dir: PathBuf,
}
impl PlaylistGenerator {
pub(crate) fn new() -> Self {
pub(crate) fn new<P: Into<PathBuf>>(video_dir: P) -> Self {
// Concurrency is tunable via HLS_CONCURRENCY so operators can dial
// it to their hardware: 1 on weak Synology boxes to avoid thermal
// throttling, higher on desktops with spare cores.
@@ -511,6 +438,7 @@ impl PlaylistGenerator {
info!("PlaylistGenerator: concurrency={}", concurrency);
PlaylistGenerator {
semaphore: Arc::new(Semaphore::new(concurrency)),
video_dir: video_dir.into(),
}
}
}
@@ -524,20 +452,23 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
fn handle(&mut self, msg: GeneratePlaylistMessage, _ctx: &mut Self::Context) -> Self::Result {
let video_file = msg.video_path.to_str().unwrap().to_owned();
let playlist_path = msg.playlist_path.as_str().to_owned();
let content_hash_str = msg.content_hash.clone();
let semaphore = self.semaphore.clone();
let video_dir = self.video_dir.clone();
let playlist_file = format!(
"{}/{}.m3u8",
playlist_path,
msg.video_path.file_name().unwrap().to_str().unwrap()
);
let hash_dir = content_hash::hls_dir(&video_dir, &content_hash_str);
let playlist_path = hls_paths::playlist_for_hash(&video_dir, &content_hash_str);
let sentinel_path = hls_paths::sentinel_for_hash(&video_dir, &content_hash_str);
let segment_template = hls_paths::segment_template_for_hash(&video_dir, &content_hash_str);
let playlist_file = playlist_path.to_string_lossy().to_string();
let segment_pattern = segment_template.to_string_lossy().to_string();
let tracer = global_tracer();
let mut span = tracer
.span_builder("playlistgenerator.generate_playlist")
.with_attributes(vec![
KeyValue::new("video_file", video_file.clone()),
KeyValue::new("content_hash", content_hash_str.clone()),
KeyValue::new("playlist_file", playlist_file.clone()),
])
.start(&tracer);
@@ -561,7 +492,7 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
)],
);
if Path::new(&playlist_file).exists() {
if playlist_path.exists() {
debug!("Playlist already exists: {}", playlist_file);
span.set_status(Status::error(format!(
"Playlist already exists: {}",
@@ -570,6 +501,19 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
return Err(std::io::Error::from(std::io::ErrorKind::AlreadyExists));
}
// Ensure the shard + hash directory exist. Idempotent — the
// dir may already be present from a prior attempt that wrote
// a sentinel before being cleared for retry.
if let Err(e) = tokio::fs::create_dir_all(&hash_dir).await {
error!(
"Failed to create HLS hash dir {}: {}",
hash_dir.display(),
e
);
span.set_status(Status::error(format!("mkdir failed: {}", e)));
return Err(e);
}
// One ffprobe call for codec + rotation metadata.
let stream_meta = probe_video_stream_meta(&video_file).await;
let is_h264 = stream_meta.is_h264;
@@ -630,16 +574,11 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
span.add_event("Transcoding to h264", vec![]);
}
// Encode to a .tmp playlist and explicit segment names so a failed
// encode leaves predictable artifacts we can clean up — and so a
// concurrent scan doesn't see a half-written .m3u8 as "done".
// Encode to a .tmp playlist alongside the final inside the
// hash dir, so a concurrent scan never sees a half-written
// .m3u8 as "done". Segments use the hash-keyed template;
// ffmpeg writes them next to the playlist (relative refs).
let playlist_tmp = format!("{}.tmp", playlist_file);
let video_stem = msg
.video_path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or("video");
let segment_pattern = format!("{}/{}_%03d.ts", playlist_path, video_stem);
let mut cmd = tokio::process::Command::new("ffmpeg");
cmd.arg("-y").arg("-i").arg(&video_file);
@@ -728,12 +667,12 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
let success = matches!(&ffmpeg_result, Ok(out) if out.status.success());
if success {
if let Err(e) = tokio::fs::rename(&playlist_tmp, &playlist_file).await {
if let Err(e) = tokio::fs::rename(&playlist_tmp, &playlist_path).await {
error!(
"ffmpeg succeeded but rename {} -> {} failed: {}",
playlist_tmp, playlist_file, e
);
cleanup_partial_hls(&playlist_tmp, playlist_path.as_str(), video_stem).await;
cleanup_partial_hls(&hash_dir).await;
span.set_status(Status::error(format!("rename failed: {}", e)));
return Err(e);
}
@@ -750,18 +689,17 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
Err(e) => format!("ffmpeg failed: {}", e),
};
error!("ffmpeg failed for {}: {}", video_file, detail);
cleanup_partial_hls(&playlist_tmp, playlist_path.as_str(), video_stem).await;
let sentinel = playlist_unsupported_sentinel(Path::new(&playlist_file));
if let Err(se) = tokio::fs::write(&sentinel, b"").await {
cleanup_partial_hls(&hash_dir).await;
if let Err(se) = tokio::fs::write(&sentinel_path, b"").await {
warn!(
"Failed to write playlist sentinel {}: {}",
sentinel.display(),
sentinel_path.display(),
se
);
} else {
info!(
"Wrote playlist sentinel {} so future scans skip {}",
sentinel.display(),
sentinel_path.display(),
video_file
);
}
@@ -772,29 +710,47 @@ impl Handler<GeneratePlaylistMessage> for PlaylistGenerator {
}
}
/// Delete the temp playlist and any segment files that ffmpeg may have written
/// before failing. Called both on ffmpeg error and on rename failure so a
/// retry on the next scan starts from a clean slate.
async fn cleanup_partial_hls(playlist_tmp: &str, playlist_dir: &str, video_stem: &str) {
let _ = tokio::fs::remove_file(playlist_tmp).await;
let segment_prefix = format!("{}_", video_stem);
let Ok(mut entries) = tokio::fs::read_dir(playlist_dir).await else {
/// Delete the partial playlist (.tmp) and any segment files left behind by
/// a failed ffmpeg run. Wipes every non-sentinel file in the hash dir;
/// retains the sentinel if one has already been written by an earlier
/// caller in the same path (today there is none, but kept defensively so
/// the function is safe to call after sentinel write too).
async fn cleanup_partial_hls(hash_dir: &Path) {
let Ok(mut entries) = tokio::fs::read_dir(hash_dir).await else {
return;
};
while let Ok(Some(entry)) = entries.next_entry().await {
let Some(name) = entry.file_name().to_str().map(str::to_owned) else {
let path = entry.path();
let is_sentinel = path
.file_name()
.and_then(|n| n.to_str())
.map(|n| n == hls_paths::UNSUPPORTED_SENTINEL_FILENAME)
.unwrap_or(false);
if is_sentinel {
continue;
};
if name.starts_with(&segment_prefix)
&& name.ends_with(".ts")
&& let Err(e) = tokio::fs::remove_file(entry.path()).await
{
warn!("Failed to remove partial segment {}: {}", name, e);
}
if let Err(e) = tokio::fs::remove_file(&path).await {
warn!(
"Failed to remove partial HLS file {}: {}",
path.display(),
e
);
}
}
}
/// First 16 chars of a content hash for log lines. Short enough to keep
/// log volume sane, long enough that distinct hashes don't collide in
/// practice.
fn short_hash(hash: &str) -> &str {
let end = hash
.char_indices()
.nth(16)
.map(|(i, _)| i)
.unwrap_or(hash.len());
&hash[..end]
}
#[derive(Message)]
#[rtype(result = "()")]
pub struct GeneratePreviewClipMessage {