Commit Graph

4 Commits

Author SHA1 Message Date
Cameron Cordes f707353807 feat: nightly agentic pre-generation of memory reels
Implement end-to-end nightly pre-generation of memory reels with agentic
scripting that grounds narration in calendar, location, messages, and RAG.

Sections A-E from the plan:

A. Extract produce_reel pipeline core from run_reel_job with
   ScripterMode::Fast/Agentic and progress callbacks.

B. Agentic scripter: factor run_readonly_tool_loop from the insight
   generator, build read-only tool gate, prompt builder with GPS, and
   generate_script_agentic with fallback to fast path.

C. Precomputed reels ledger (SQLite table + DAO), GET /reels/precomputed
   handler with validity gate, GET /reels/by-key/{key}/video streaming,
   and normalize_library_key helper.

D. Nightly scheduler: spawn_pregen_scheduler with configurable hour,
   run_pregen_batch (day/week/month spans), pregen_one with dedup and
   disk-check, secs_until_next_run_hour time math.

E. user_ai_prefs passive mirror table + DAO for param capture in
   create_reel_handler and replay in the scheduler.

Also fixes resolve_library_param signature to take &[Library] and adds
resolve_library_param_state wrapper for AppState callers.

New files: migrations/2026-06-13-000000_add_precomputed_reels/,
  migrations/2026-06-13-000010_add_user_ai_prefs/,
  src/database/precomputed_reel_dao.rs,
  src/database/user_ai_prefs_dao.rs
2026-06-13 14:29:34 -04:00
Cameron Cordes 65793a2dda Reels: mixed-media (video clip beats) + faster burst fade
Videos in a span now appear as clip beats: the first few seconds of the
video (capped at CLIP_SECONDS=5, and to the source length) filled to the
portrait canvas like photos, with its live audio ducked under the
narration (amix at 0.35). If the narration outlasts the clip, the last
frame is held (tpad); clips with no audio track just play under narration.

Selection splits the beat budget between photo beats and clip beats —
clips get up to half (≥1 when present), photos the rest — then merges
both back into chronological order. SegmentMedia gains a Clip variant;
beats carry `media` (photos or one clip) and the cache key tags P/C so a
path used as a still vs a clip differ.

Also drops the burst fade from 0.15s to 0.08s so a quick burst reads
clearly differently from a held shot. Bumps RENDER_VERSION.

The clip filtergraph (fill + duck-mix + last-frame hold) is unit-tested
but, like the rest of the ffmpeg path, wants a real render check on the
GPU host.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 00:02:51 -04:00
Cameron Cordes 6e90f24307 Reels: burst beats + duration budget for week/month, plus step logging
Restructures a reel around beats — one narration line over one or more
photos — instead of one line per photo. A single-photo beat is a held
shot; a multi-photo beat is a quick burst that flashes through several
moments of an event while the line is read. So a week/month reel can show
everything it spans without a narrated (and timed) segment per photo.

Selection (selector.rs):
- Duration budget: cap the number of narrated beats to ~REEL_TARGET_SECONDS
  (default 90, env-tunable) so week/month reels don't run minutes long.
- Event clustering by time gap; when there are more events than the beat
  budget, adjacent events merge so the whole span stays covered. Each beat
  bursts up to MAX_BURST_PHOTOS (an even spread), so a 40-shot dinner
  contributes a handful of quick frames, not forty narrated seconds.

Render (render.rs): a beat renders its photos as a concat of per-photo
fills (blurred-bg portrait, fps-before-fade) under one muxed narration;
burst photos get a snappier fade. beat_durations splits the narration
across the photos, stretching only if a long burst would flash too fast.

Adds high-level info logs across the steps (request → script → per-beat
narrate/render → join → done with elapsed) for visibility. Bumps
RENDER_VERSION to re-render cached reels.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 23:43:18 -04:00
Cameron Cordes e3f731b3b2 Add memory-reel backend: on-demand narrated photo slideshow
New POST /reels + GET /reels/{id} (+ /video) build an MP4 slideshow of a
memory span (day/week/month), narrated by the LLM in a cloned voice.

Pipeline (src/reels/): a selector resolves which photos + reel metadata,
the scripter writes one narration line per photo via a single LLM call
(reusing each photo's cached insight as context — no fresh vision calls,
so reel generation stays off the GPU's vision slot), each line is
synthesized to speech, and the renderer assembles stills + narration via
ffmpeg. Jobs run in the background (mirroring the TTS speech-job
registry) since a reel takes minutes; the finished MP4 is cached on disk
keyed by the selection so a repeat request is instant.

The segment model is media-typed (Photo today) so a video-clip segment
(phase 2) and a nightly pre-render (phase 3) slot in without reworking
the pipeline. Ken Burns motion is implemented but defaulted off pending a
visual check on the GPU box.

Supporting changes:
- memories: extract gather_memory_items() so the reel selector reuses the
  exact window/exclusion/tz/sort logic behind /memories.
- ai::tts: add synthesize_serialized() so reel narration honors the same
  single-GPU permit + write lease as user TTS requests.
- video::ffmpeg: make get_duration_seconds() pub for narration timing.
- AppState: reels_path (REELS_DIRECTORY, defaults beside preview clips).

Pure logic (cache key, script parsing, ffmpeg arg/filter construction,
even sampling, segment timing) is unit-tested (26 tests). The runtime
path (ffmpeg render, TTS, LLM) needs a real run on the GPU host to verify
end-to-end — not exercisable in CI.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 22:31:08 -04:00