Reels: burst beats + duration budget for week/month, plus step logging

Restructures a reel around beats — one narration line over one or more
photos — instead of one line per photo. A single-photo beat is a held
shot; a multi-photo beat is a quick burst that flashes through several
moments of an event while the line is read. A week/month reel can
therefore show everything it spans without a narrated (and timed) segment
per photo.
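
A minimal sketch of the beat shape described above, assuming only the
fields visible in this diff (photos, date, insight_title, insight_summary).
The real PlannedBeat may carry more; SegmentMedia is reduced here to one
variant, the library_id type is a guess, and is_burst and photo are
hypothetical helpers added for illustration:

```rust
/// Simplified stand-in for the crate's `SegmentMedia`
/// (only the variant that appears in this diff; `i32` is an assumption).
#[derive(Debug, Clone, PartialEq)]
enum SegmentMedia {
    Photo { rel_path: String, library_id: i32 },
}

/// One narration line over one or more photos.
#[derive(Debug, Clone, PartialEq)]
struct PlannedBeat {
    photos: Vec<SegmentMedia>,
    date: Option<i64>,
    insight_title: Option<String>,
    insight_summary: Option<String>,
}

impl PlannedBeat {
    /// Hypothetical helper: a beat with several photos is a burst,
    /// a beat with exactly one photo is a held shot.
    fn is_burst(&self) -> bool {
        self.photos.len() > 1
    }
}

/// Hypothetical constructor used only to keep the example short.
fn photo(name: &str) -> SegmentMedia {
    SegmentMedia::Photo {
        rel_path: name.to_string(),
        library_id: 1,
    }
}
```

A beat built with a single photo("a.jpg") reports is_burst() == false;
the same beat with three photos reports true.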

Selection (selector.rs):
- Duration budget: cap the number of narrated beats so a reel runs about
  REEL_TARGET_SECONDS (default 90, env-tunable) and week/month reels don't
  run minutes long.
- Event clustering by time gap; when there are more events than the beat
  budget, adjacent events merge so the whole span stays covered. Each beat
  bursts at most MAX_BURST_PHOTOS photos, spread evenly across its event,
  so a 40-shot dinner contributes a handful of quick frames, not forty
  narrated seconds.
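
The selection steps above can be sketched roughly as follows. This is not
the code in selector.rs: the function names, the seconds-per-beat estimate,
the smallest-gap merge rule, and the MAX_BURST_PHOTOS value are all
assumptions made for illustration.

```rust
/// Hypothetical value; the commit names MAX_BURST_PHOTOS but not its size.
const MAX_BURST_PHOTOS: usize = 5;

/// Beats that fit the budget, assuming each narrated beat takes roughly
/// `secs_per_beat` seconds (an assumed estimate). Always at least one beat.
fn beat_budget(target_secs: u32, secs_per_beat: u32) -> usize {
    (target_secs / secs_per_beat.max(1)).max(1) as usize
}

/// Group sorted timestamps (seconds) into events: a new event starts
/// whenever the gap to the previous photo exceeds `gap_secs`.
fn cluster_by_gap(times: &[i64], gap_secs: i64) -> Vec<Vec<i64>> {
    let mut events: Vec<Vec<i64>> = Vec::new();
    for &t in times {
        let starts_new = match events.last().and_then(|ev| ev.last()) {
            Some(&prev) => t - prev > gap_secs,
            None => true,
        };
        if starts_new {
            events.push(vec![t]);
        } else {
            events.last_mut().unwrap().push(t);
        }
    }
    events
}

/// Merge adjacent events until at most `max_beats` remain. The pair with
/// the smallest gap between them merges first (one possible rule).
fn merge_to_budget(mut events: Vec<Vec<i64>>, max_beats: usize) -> Vec<Vec<i64>> {
    while events.len() > max_beats.max(1) {
        let i = (0..events.len() - 1)
            .min_by_key(|&i| events[i + 1][0] - *events[i].last().unwrap())
            .unwrap();
        let next = events.remove(i + 1);
        events[i].extend(next);
    }
    events
}

/// Pick at most `max` items spread evenly across `items`, keeping the
/// first and last so the burst covers the whole event.
fn even_spread<T: Clone>(items: &[T], max: usize) -> Vec<T> {
    if max == 0 {
        return Vec::new();
    }
    if items.len() <= max {
        return items.to_vec();
    }
    if max == 1 {
        return vec![items[0].clone()];
    }
    (0..max)
        .map(|k| items[k * (items.len() - 1) / (max - 1)].clone())
        .collect()
}
```

With these assumptions, a 90-second target at about 6 seconds per beat
gives 15 beats, and even_spread over a 40-photo event with
MAX_BURST_PHOTOS = 5 keeps the photos at indices 0, 9, 19, 29 and 39.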

Render (render.rs): a beat renders its photos as a concat of per-photo
fills (blurred-bg portrait, fps-before-fade) under one muxed narration;
burst photos get a snappier fade. beat_durations splits the narration
time across the photos, stretching the beat only if a long burst would
otherwise flash too fast.
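
One way the split could work, as a sketch only: the signature, the even
split, and the MIN_PHOTO_SECS floor are assumptions, not the actual
beat_durations in render.rs.

```rust
/// Hypothetical floor: the shortest time a burst photo stays on screen.
const MIN_PHOTO_SECS: f64 = 0.6;

/// Split a narration of `narration_secs` evenly across `photo_count`
/// photos. If the even share would fall below the floor, every photo gets
/// the floor instead, which stretches the beat past the narration length.
fn beat_durations_sketch(narration_secs: f64, photo_count: usize) -> Vec<f64> {
    if photo_count == 0 {
        return Vec::new();
    }
    let even_share = narration_secs / photo_count as f64;
    vec![even_share.max(MIN_PHOTO_SECS); photo_count]
}
```

Under these assumptions a 4-second line over 2 photos gives 2 seconds each
with no stretch, while a 2-second line over 5 photos would be 0.4 seconds
each, so each photo is raised to the 0.6-second floor and the beat runs
about 3 seconds.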

Adds high-level info logs across the steps (request → script → per-beat
narrate/render → join → done with elapsed) for visibility. Bumps
RENDER_VERSION to re-render cached reels.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cameron Cordes
2026-06-12 23:43:18 -04:00
parent 740fc4d841
commit 6e90f24307
4 changed files with 580 additions and 204 deletions
+61 -34
@@ -1,10 +1,11 @@
 //! Narration scripting for memory reels.
 //!
-//! One LLM call turns the planned segments (each carrying its date and, where
+//! One LLM call turns the planned beats (each carrying its date and, where
 //! available, its cached insight) into a short first-person narration line per
-//! photo plus a title for the reel. We reuse the cached insight summary as the
-//! richest per-photo signal rather than re-running vision at reel time — that
-//! keeps reel generation off the GPU's vision slot entirely.
+//! beat plus a title for the reel. A beat may show several photos in a quick
+//! burst, so a line narrates the *moment*, not a single frame. We reuse the
+//! cached insight summary as the richest signal rather than re-running vision
+//! at reel time — that keeps reel generation off the GPU's vision slot.
 //!
 //! The prompt builder and response parser are pure so the contract is
 //! unit-testable; `generate_script` wires them to the LLM client.
@@ -12,11 +13,11 @@
 use anyhow::{Context, Result};
 use std::sync::Arc;
-use super::{PlannedSegment, ReelMeta};
+use super::{PlannedBeat, ReelMeta};
 use crate::ai::llamacpp::LlamaCppClient;
 use crate::ai::llm_client::LlmClient;
-/// The narration for a whole reel: a title and one line per segment, in order.
+/// The narration for a whole reel: a title and one line per beat, in order.
 #[derive(Debug, Clone, PartialEq)]
 pub struct ReelScript {
     pub title: String,
@@ -26,33 +27,38 @@ pub struct ReelScript {
 const SYSTEM_PROMPT: &str = "You are narrating a personal memory reel — a short \
     slideshow of someone's own photos set to a spoken voiceover. Write warm, \
     specific, first-person narration as if the person is gently looking back on \
-    their own memories. Be concrete and grounded in the details given; never \
-    invent names, places, or events that aren't supported. Keep each line to one \
-    or two short sentences that can be read aloud in a few seconds. Avoid generic \
-    filler like \"what a wonderful day\" — if you have little to go on, simply \
-    describe the moment plainly.";
+    their own memories. Each line plays over one moment, which may be a quick burst \
+    of several photos, so narrate the moment as a whole rather than a single frame. \
+    Be concrete and grounded in the details given; never invent names, places, or \
+    events that aren't supported. Keep each line to one or two short sentences that \
+    can be read aloud in a few seconds. Avoid generic filler like \"what a \
+    wonderful day\" — if you have little to go on, simply describe the moment \
+    plainly.";
 /// Build the (system, user) prompt pair for the scripter. The user message
-/// describes each segment in order and asks for strict JSON back.
-pub fn build_script_messages(meta: &ReelMeta, planned: &[PlannedSegment]) -> (String, String) {
+/// describes each beat in order and asks for strict JSON back.
+pub fn build_script_messages(meta: &ReelMeta, beats: &[PlannedBeat]) -> (String, String) {
     let mut user = String::new();
     user.push_str(&format!(
-        "These are {} photos surfaced as memories {}.\n\n",
-        planned.len(),
+        "This reel has {} moments surfaced as memories {}.\n\n",
+        beats.len(),
         meta.span_phrase()
     ));
     if !meta.years.is_empty() {
         let years: Vec<String> = meta.years.iter().map(|y| y.to_string()).collect();
         user.push_str(&format!("They span the years: {}.\n\n", years.join(", ")));
     }
-    user.push_str("Photos, in the order they will appear:\n");
-    for (i, seg) in planned.iter().enumerate() {
+    user.push_str("Moments, in the order they will appear:\n");
+    for (i, beat) in beats.iter().enumerate() {
         user.push_str(&format!("\n[{}]", i + 1));
-        if let Some(date) = seg.date_label() {
+        if let Some(date) = beat.date_label() {
             user.push_str(&format!(" {date}"));
         }
+        if beat.photos.len() > 1 {
+            user.push_str(&format!(" (a burst of {} photos)", beat.photos.len()));
+        }
         user.push('\n');
-        match (&seg.insight_title, &seg.insight_summary) {
+        match (&beat.insight_title, &beat.insight_summary) {
             (Some(t), Some(s)) if !s.trim().is_empty() => {
                 user.push_str(&format!(" Known context: {t}{s}\n"));
             }
@@ -65,10 +71,10 @@ pub fn build_script_messages(meta: &ReelMeta, planned: &[PlannedSegment]) -> (St
     }
     user.push_str(&format!(
         "\nReturn ONLY a JSON object, no prose or code fences, shaped exactly:\n\
-         {{\"title\": \"<short reel title>\", \"segments\": [\"<line for photo 1>\", \
-         \"<line for photo 2>\", ... ]}}\n\
-         The \"segments\" array MUST have exactly {} items, one per photo in order.",
-        planned.len()
+         {{\"title\": \"<short reel title>\", \"segments\": [\"<line for moment 1>\", \
+         \"<line for moment 2>\", ... ]}}\n\
+         The \"segments\" array MUST have exactly {} items, one per moment in order.",
+        beats.len()
     ));
     (SYSTEM_PROMPT.to_string(), user)
 }
@@ -174,20 +180,20 @@ fn clean_text(s: &str) -> String {
     trimmed.split_whitespace().collect::<Vec<_>>().join(" ")
 }
-/// Generate the reel script via the LLM. Text-only (no images) — the per-photo
+/// Generate the reel script via the LLM. Text-only (no images) — the per-beat
 /// context comes from cached insights. The call takes the GPU read lease
 /// internally (see `LlamaCppClient::generate`).
 pub async fn generate_script(
     client: &Arc<LlamaCppClient>,
     meta: &ReelMeta,
-    planned: &[PlannedSegment],
+    beats: &[PlannedBeat],
 ) -> Result<ReelScript> {
-    let (system, user) = build_script_messages(meta, planned);
+    let (system, user) = build_script_messages(meta, beats);
     let raw = client
         .generate(&user, Some(&system), None)
         .await
         .context("LLM script generation failed")?;
-    Ok(parse_script_response(&raw, planned.len()))
+    Ok(parse_script_response(&raw, beats.len()))
 }
 #[cfg(test)]
@@ -202,13 +208,13 @@ mod tests {
         }
     }
-    fn planned(n: usize) -> Vec<PlannedSegment> {
+    fn planned(n: usize) -> Vec<PlannedBeat> {
         (0..n)
-            .map(|i| PlannedSegment {
-                media: super::super::SegmentMedia::Photo {
+            .map(|i| PlannedBeat {
+                photos: vec![super::super::SegmentMedia::Photo {
                     rel_path: format!("p{i}.jpg"),
                     library_id: 1,
-                },
+                }],
                 date: Some(1_560_000_000 + i as i64 * 86_400),
                 insight_title: None,
                 insight_summary: None,
@@ -217,16 +223,37 @@ mod tests {
     }
     #[test]
-    fn prompt_states_exact_segment_count_and_span() {
+    fn prompt_states_exact_moment_count_and_span() {
         let (sys, user) = build_script_messages(&meta(), &planned(3));
         assert!(sys.contains("memory reel"));
-        assert!(user.contains("3 photos"));
+        assert!(user.contains("3 moments"));
         assert!(user.contains("on this day"));
         assert!(user.contains("exactly 3 items"));
-        // Each photo gets an indexed entry.
+        // Each moment gets an indexed entry.
         assert!(user.contains("[1]") && user.contains("[2]") && user.contains("[3]"));
     }
+    #[test]
+    fn prompt_notes_burst_photo_count() {
+        let mut p = planned(1);
+        p[0].photos = vec![
+            super::super::SegmentMedia::Photo {
+                rel_path: "a.jpg".into(),
+                library_id: 1,
+            },
+            super::super::SegmentMedia::Photo {
+                rel_path: "b.jpg".into(),
+                library_id: 1,
+            },
+            super::super::SegmentMedia::Photo {
+                rel_path: "c.jpg".into(),
+                library_id: 1,
+            },
+        ];
+        let (_sys, user) = build_script_messages(&meta(), &p);
+        assert!(user.contains("a burst of 3 photos"));
+    }
     #[test]
     fn prompt_includes_insight_context_when_present() {
         let mut p = planned(1);