Add memory-reel backend: on-demand narrated photo slideshow

New POST /reels + GET /reels/{id} (+ /video) build an MP4 slideshow of a
memory span (day/week/month), narrated by the LLM in a cloned voice.

Pipeline (src/reels/): a selector resolves the photo set + reel metadata,
the scripter writes one narration line per photo via a single LLM call
(reusing each photo's cached insight as context — no fresh vision calls,
so reel generation stays off the GPU's vision slot), each line is
synthesized to speech, and the renderer assembles stills + narration via
ffmpeg. Jobs run in the background (mirroring the TTS speech-job
registry) since a reel takes minutes; the finished MP4 is cached on disk
keyed by the selection so a repeat request is instant.
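
As an illustration of "keyed by the selection", here is a minimal sketch of
deriving a stable on-disk cache key. The names Span, cache_key and
cached_video_path are hypothetical and are not taken from src/reels/. The
FNV-1a hash is also an assumption, chosen only because it is deterministic
across builds. The real key may be built differently.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical span type; the commit message names day/week/month spans.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Span {
    Day,
    Week,
    Month,
}

/// FNV-1a 64-bit hash. Used here (an assumption) because its output does not
/// change between compiler versions, unlike std's DefaultHasher.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Build a key from everything that determines the rendered output:
/// the span, its anchor date, and the ordered list of photo paths.
/// Order is part of the key because it decides segment order in the video.
pub fn cache_key(span: Span, anchor_date: &str, photo_paths: &[&str]) -> String {
    let mut buf = String::new();
    buf.push_str(match span {
        Span::Day => "day",
        Span::Week => "week",
        Span::Month => "month",
    });
    buf.push('\n');
    buf.push_str(anchor_date);
    for p in photo_paths {
        buf.push('\n');
        buf.push_str(p);
    }
    format!("{:016x}", fnv1a64(buf.as_bytes()))
}

/// Where a finished reel for this key would live under the reels directory.
pub fn cached_video_path(reels_dir: &Path, key: &str) -> PathBuf {
    reels_dir.join(format!("{key}.mp4"))
}
```

With a key like this, a repeat request for the same selection can be served
by checking whether cached_video_path(...) already exists before queuing a
job.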

The segment model is media-typed (Photo today) so a video-clip segment
(phase 2) and a nightly pre-render (phase 3) slot in without reworking
the pipeline. Ken Burns motion is implemented but defaulted off pending a
visual check on the GPU box.
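
A rough sketch of what a media-typed segment plus narration-driven timing
could look like. All type names, field names and constants here (Media,
Segment, MIN_STILL_SECS, TAIL_PAD_SECS) are assumptions for illustration, not
the definitions in src/reels/. The Option<f64> input mirrors the return shape
of get_duration_seconds().

```rust
use std::path::PathBuf;

/// What a segment shows. Only Photo exists today; a video-clip variant
/// could be added later without changing the code that walks segments.
#[derive(Debug, Clone, PartialEq)]
pub enum Media {
    Photo { path: PathBuf },
}

/// One unit of the reel: a visual, its narration text, and how long it plays.
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub media: Media,
    pub narration: String,
    pub duration_secs: f64,
}

/// Assumed floor so a still is never flashed too briefly.
pub const MIN_STILL_SECS: f64 = 3.0;
/// Assumed pause after the narration ends before cutting to the next still.
pub const TAIL_PAD_SECS: f64 = 0.5;

/// How long a still stays on screen, given the probed narration length.
/// A missing or nonsensical probe result falls back to the minimum.
pub fn still_duration(narration_secs: Option<f64>) -> f64 {
    match narration_secs {
        Some(d) if d.is_finite() && d > 0.0 => (d + TAIL_PAD_SECS).max(MIN_STILL_SECS),
        _ => MIN_STILL_SECS,
    }
}

/// Start offset of each segment on the final timeline (running sum).
pub fn start_offsets(segments: &[Segment]) -> Vec<f64> {
    let mut t = 0.0;
    segments
        .iter()
        .map(|s| {
            let start = t;
            t += s.duration_secs;
            start
        })
        .collect()
}
```

Keeping timing as a pure function of the probed duration is what makes it
unit-testable without ffmpeg or TTS available.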

Supporting changes:
- memories: extract gather_memory_items() so the reel selector reuses the
  exact window/exclusion/tz/sort logic behind /memories.
- ai::tts: add synthesize_serialized() so reel narration honors the same
  single-GPU permit + write lease as user TTS requests.
- video::ffmpeg: make get_duration_seconds() pub for narration timing.
- AppState: add reels_path (REELS_DIRECTORY, defaults beside preview clips).

Pure logic (cache key, script parsing, ffmpeg arg/filter construction,
even sampling, segment timing) is unit-tested (26 tests). The runtime
path (ffmpeg render, TTS, LLM) needs a real run on the GPU host to verify
end-to-end — not exercisable in CI.
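
For reference, "even sampling" can be sketched as below: when a span holds
more photos than a reel should show, pick indices spread uniformly across the
sorted list, always keeping the first and last. The function name and the
exact rounding are assumptions. The tested implementation may differ.

```rust
/// Pick at most `max` indices from `0..len`, evenly spaced and strictly
/// increasing, always including the first and last item when `max >= 2`.
pub fn even_sample_indices(len: usize, max: usize) -> Vec<usize> {
    if len == 0 || max == 0 {
        return Vec::new();
    }
    if len <= max {
        // Nothing to drop: keep every item.
        return (0..len).collect();
    }
    if max == 1 {
        return vec![0];
    }
    // Map i in 0..max linearly onto 0..=len-1 using integer arithmetic.
    // Because len > max, consecutive results always differ.
    (0..max).map(|i| i * (len - 1) / (max - 1)).collect()
}
```

For example, 10 photos capped at 4 yields indices 0, 3, 6, 9.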

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Cameron Cordes

2026-06-12 22:31:08 -04:00

parent 98274c3301

commit e3f731b3b2

9 changed files with 1615 additions and 30 deletions

									
										src/video/ffmpeg.rs
									
		+1
		-1
	
												View File
												
				@@ -231,7 +231,7 @@ impl Ffmpeg {

				/// a hard failure — previously the `parse::<f64>` on empty stdout produced

				/// "cannot parse float from empty string" and poisoned the preview-clip row

				/// with status=failed, which the watcher would re-queue every full scan.

				async fn get_duration_seconds(input_file: &str) -> Result<Option<f64>> {

				pub async fn get_duration_seconds(input_file: &str) -> Result<Option<f64>> {

				    if let Some(d) = probe_duration(input_file, "format=duration").await? {

				        return Ok(Some(d));

				    }