hls: hash-keyed HTTP routes for /video/generate and serving

`POST /video/generate` is reshaped to return a JSON object instead of
a bare string. New fields:

- `playlist_url`: stable hash-keyed URL of the form
  `/video/hls/<hash>/playlist.m3u8`. Use this with hls.js / native
  players — relative segment refs inside the playlist resolve to
  `/video/hls/<hash>/segment_NNN.ts` because the URL is path-based.
- `content_hash`: the blake3 hex digest that identifies the bytes.
  Stable across libraries, archive ingests, renames; clients can
  cache the URL by hash.
- `ready`: true iff the playlist file is already on disk. False means
  a transcode was just queued; the client should retry the URL after
  a short delay (or rely on hls.js's built-in retry).
- `playlist` (legacy): basename-keyed path string, echoed under the
  old field name so clients that destructure `response.playlist` keep
  working during the rollout. The startup migration deletes the
  underlying file, so this URL will 404; clients should migrate to
  `playlist_url`. Field is slated for removal once Apollo / File
  Viewer ship the update.

The handler:
- resolves the source path across libraries (same logic as before),
- looks up `image_exif.content_hash` for that (library_id, rel_path),
- falls back to inline `content_hash::compute` when the row is mid-
  backfill — pure read, no library mutation,
- sends a single-element `QueueVideosMessage` to `VideoPlaylistManager`
  if the playlist isn't already on disk and there's no
  `playlist.unsupported` sentinel,
- returns the URL immediately. The actor pipeline owns transcoding.

New route `GET /video/hls/{hash}/{file}`:
- strict validation: hash must be 64 ascii-hex chars; file must be
  `playlist.m3u8` or `segment_NNN.ts` (digits only). Anything else
  returns 400 so we never have to rely on path canonicalisation
  alone to defend against traversal,
- belt-and-suspenders canonicalize() guard verifies the resolved
  file lives under `$VIDEO_PATH`,
- serves with the standard `NamedFile::into_response` machinery.

Cleanup in `actors.rs`:
- `ProcessMessage` + its `StreamActor` handler had no senders after
  the rewire — removed. `StreamActor` itself stays (still handles
  `RefreshThumbnailsMessage` from `files.rs`).
- `create_playlist`, `playlist_file_for`,
  `playlist_unsupported_sentinel` are gone — the legacy on-demand
  transcode helper and the migration-only path helpers had no
  remaining users (the migration uses its own classify() function).
- Imports tightened: dropped `Child`, `ExitStatus`, `trace`.

Tests cover both new validators (`is_valid_hash`,
`is_allowed_hls_filename`) including the strings that motivated the
defence-in-depth (traversal attempts, internal `.tmp`/`.unsupported`
artifacts, malformed segment names).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-14 15:51:01 -04:00
parent 78fabc2b32
commit 7c153596fe
3 changed files with 297 additions and 144 deletions

View File

@@ -5,12 +5,12 @@ use crate::otel::global_tracer;
use crate::video::ffmpeg::{generate_preview_clip, get_duration_seconds_blocking};
use crate::video::hls_paths;
use actix::prelude::*;
use log::{debug, error, info, trace, warn};
use log::{debug, error, info, warn};
use opentelemetry::KeyValue;
use opentelemetry::trace::{Span, Status, Tracer};
use std::io::Result;
use std::path::{Path, PathBuf};
use std::process::{Child, Command, ExitStatus, Stdio};
use std::process::{Command, Stdio};
use std::sync::{Arc, Mutex};
use tokio::sync::Semaphore;
// ffmpeg -i test.mp4 -c:v h264 -flags +cgop -g 30 -hls_time 3 out.m3u8
@@ -22,31 +22,6 @@ impl Actor for StreamActor {
type Context = Context<Self>;
}
pub struct ProcessMessage(pub String, pub Child);
impl Message for ProcessMessage {
type Result = Result<ExitStatus>;
}
impl Handler<ProcessMessage> for StreamActor {
type Result = Result<ExitStatus>;
fn handle(&mut self, msg: ProcessMessage, _ctx: &mut Self::Context) -> Self::Result {
trace!("Message received");
let mut process = msg.1;
let result = process.wait();
debug!(
"Finished waiting for: {:?}. Code: {:?}",
msg.0,
result
.as_ref()
.map_or(-1, |status| status.code().unwrap_or(-1))
);
result
}
}
/// A video paired with its content hash, ready to be queued for HLS
/// playlist generation. Hash is required because all output paths are
/// keyed on it; callers that lack a hash (rows mid-backfill) must skip
@@ -57,70 +32,6 @@ pub struct VideoToQueue {
pub content_hash: String,
}
/// Legacy basename-keyed playlist path. Retained for the one-shot startup
/// migration that retires pre-content-hash output; new playlist writes go
/// through [`hls_paths::playlist_for_hash`]. Will be removed once the
/// migration ships and runs to completion in production.
#[allow(dead_code)]
pub fn playlist_file_for(playlist_dir: &str, video_path: &Path) -> PathBuf {
let filename = video_path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or("unknown");
PathBuf::from(format!("{}/{}.m3u8", playlist_dir, filename))
}
/// Legacy basename-keyed sentinel path. Same migration-only contract as
/// [`playlist_file_for`].
#[allow(dead_code)]
pub fn playlist_unsupported_sentinel(playlist_file: &Path) -> PathBuf {
let mut s = playlist_file.as_os_str().to_owned();
s.push(".unsupported");
PathBuf::from(s)
}
pub async fn create_playlist(video_path: &str, playlist_file: &str) -> Result<Child> {
if Path::new(playlist_file).exists() {
debug!("Playlist already exists: {}", playlist_file);
return Err(std::io::Error::from(std::io::ErrorKind::AlreadyExists));
}
let result = Command::new("ffmpeg")
.arg("-i")
.arg(video_path)
.arg("-c:v")
.arg("h264")
.arg("-crf")
.arg("21")
.arg("-preset")
.arg("veryfast")
.arg("-hls_time")
.arg("3")
.arg("-hls_list_size")
.arg("0")
.arg("-hls_playlist_type")
.arg("vod")
.arg("-vf")
.arg("scale='min(1080,iw)':-2,setsar=1:1")
.arg(playlist_file)
.stdout(Stdio::null())
.stderr(Stdio::null())
.spawn();
let start_time = std::time::Instant::now();
loop {
actix::clock::sleep(std::time::Duration::from_secs(1)).await;
if Path::new(playlist_file).exists()
|| std::time::Instant::now() - start_time > std::time::Duration::from_secs(5)
{
break;
}
}
result
}
pub fn generate_video_thumbnail(path: &Path, destination: &Path) -> std::io::Result<()> {
// Probe duration up front and seek to ~50% — gives a more
// representative frame than a fixed offset (skipping title cards on