Add memory-reel backend: on-demand narrated photo slideshow

New POST /reels + GET /reels/{id} (+ /video) build an MP4 slideshow of a
memory span (day/week/month), narrated by the LLM in a cloned voice.

Pipeline (src/reels/): a selector resolves the photos and reel metadata,
the scripter writes one narration line per photo via a single LLM call
(reusing each photo's cached insight as context — no fresh vision calls,
so reel generation stays off the GPU's vision slot), each line is
synthesized to speech, and the renderer assembles stills + narration via
ffmpeg. Jobs run in the background (mirroring the TTS speech-job
registry) since a reel takes minutes; the finished MP4 is cached on disk
keyed by the selection so a repeat request is instant.
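As a rough illustration of "cached on disk keyed by the selection", the sketch below derives a stable file name from a selection. The `Selection` and `Span` types, their fields, and the use of FNV-1a are assumptions made for this example; the commit does not show the actual cache-key code. A hand-rolled hash is used because the standard library's default hasher is not guaranteed to produce the same value across Rust releases, which matters for files that outlive a rebuild.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical span type; the commit only names day/week/month.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Span {
    Day,
    Week,
    Month,
}

/// Hypothetical selection; the real selector's fields are not shown in the commit.
pub struct Selection {
    pub span: Span,
    pub anchor_date: String,    // e.g. "2024-06-03"
    pub photo_ids: Vec<String>, // in playback order
}

// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// Fold `bytes` into a running FNV-1a hash.
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Derive a 16-hex-digit key from everything that affects the rendered output.
pub fn cache_key(sel: &Selection) -> String {
    let span = match sel.span {
        Span::Day => "day",
        Span::Week => "week",
        Span::Month => "month",
    };
    let mut h = fnv1a(FNV_OFFSET, span.as_bytes());
    h = fnv1a(h, &[0]);
    h = fnv1a(h, sel.anchor_date.as_bytes());
    for id in &sel.photo_ids {
        // A zero byte separates fields so ["ab", "c"] and ["a", "bc"] feed different bytes.
        h = fnv1a(h, &[0]);
        h = fnv1a(h, id.as_bytes());
    }
    format!("{:016x}", h)
}

/// Path of the cached MP4 for this selection inside the reels directory.
pub fn cached_reel_path(reels_dir: &Path, sel: &Selection) -> PathBuf {
    reels_dir.join(format!("{}.mp4", cache_key(sel)))
}
```

With a scheme like this, a repeat request only needs to check whether `cached_reel_path(...)` exists before starting a job. Photo order is part of the key here, on the assumption that a reordered reel is a different video.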
The segment model is media-typed (Photo today) so a video-clip segment
(phase 2) and a nightly pre-render (phase 3) slot in without reworking
the pipeline. Ken Burns motion is implemented but defaulted off pending a
visual check on the GPU box.

Supporting changes:
- memories: extract gather_memory_items() so the reel selector reuses the
  exact window/exclusion/tz/sort logic behind /memories.
- ai::tts: add synthesize_serialized() so reel narration honors the same
  single-GPU permit + write lease as user TTS requests.
- video::ffmpeg: make get_duration_seconds() pub for narration timing.
- AppState: reels_path (REELS_DIRECTORY, defaults beside preview clips).
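To show why narration timing needs the probed audio duration, here is a minimal sketch of turning per-line narration lengths into per-photo display times. It assumes each still is shown for its narration length plus a small pad, and that a missing or unusable duration (the `None` case of `Result<Option<f64>>`) falls back to a fixed length. `SegmentTiming`, `segment_timings`, and both parameters are hypothetical names, not taken from the commit.

```rust
/// Start offset and on-screen length of one still, in seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SegmentTiming {
    pub start: f64,
    pub length: f64,
}

/// Lay segments end to end. Each entry of `narration_secs` is the probed
/// duration of that segment's narration audio, or `None` if probing found none.
pub fn segment_timings(
    narration_secs: &[Option<f64>],
    pad_secs: f64,
    fallback_secs: f64,
) -> Vec<SegmentTiming> {
    let mut start = 0.0;
    narration_secs
        .iter()
        .map(|d| {
            let length = match d {
                // Usable duration: hold the still slightly longer than the speech.
                Some(secs) if secs.is_finite() && *secs > 0.0 => secs + pad_secs,
                // Missing, zero, or non-finite duration: use the fixed fallback.
                _ => fallback_secs,
            };
            let timing = SegmentTiming { start, length };
            start += length;
            timing
        })
        .collect()
}
```

For example, durations of 2.0 s, unknown, and 3.0 s with a 0.5 s pad and a 4.0 s fallback give segments starting at 0.0, 2.5, and 6.5 seconds, for a 10-second reel.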
Pure logic (cache key, script parsing, ffmpeg arg/filter construction,
even sampling, segment timing) is unit-tested (26 tests). The runtime
path (ffmpeg render, TTS, LLM) needs a real run on the GPU host to verify
end-to-end; that path is not exercisable in CI.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
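Of the pure logic the message lists, "even sampling" is the easiest to picture in isolation. The sketch below assumes it means capping a long span at a maximum number of photos while keeping the picks spread across the whole span, always including the first and last item. The function name and the exact index rule are assumptions; the commit does not show the real implementation.

```rust
/// Pick at most `max` items from `items`, spread evenly and in original order.
/// When the input already fits, it is returned unchanged.
pub fn even_sample<T: Clone>(items: &[T], max: usize) -> Vec<T> {
    let n = items.len();
    if max == 0 {
        return Vec::new();
    }
    if n <= max {
        return items.to_vec();
    }
    if max == 1 {
        // A single slot cannot cover both ends; keep the first item.
        return vec![items[0].clone()];
    }
    // Here n > max >= 2, so the step (n - 1) / (max - 1) exceeds 1 and the
    // floored indices are strictly increasing from 0 to n - 1.
    (0..max)
        .map(|i| items[i * (n - 1) / (max - 1)].clone())
        .collect()
}
```

Integer arithmetic keeps the result deterministic, which is what makes this kind of helper straightforward to unit-test without ffmpeg, TTS, or the LLM.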
@@ -231,7 +231,7 @@ impl Ffmpeg {
     /// a hard failure — previously the `parse::<f64>` on empty stdout produced
     /// "cannot parse float from empty string" and poisoned the preview-clip row
     /// with status=failed, which the watcher would re-queue every full scan.
-    async fn get_duration_seconds(input_file: &str) -> Result<Option<f64>> {
+    pub async fn get_duration_seconds(input_file: &str) -> Result<Option<f64>> {
         if let Some(d) = probe_duration(input_file, "format=duration").await? {
             return Ok(Some(d));
         }