ImageApi

Author	SHA1	Message	Date
Cameron Cordes	0a40e78528	Unified search: UNIFIED_SEARCH_MODEL env override for the translation step Pin the NL->structured translation to a small, fast model that can stay co-resident with CLIP (and the chat model) so it never evicts them on a tight VRAM budget. Precedence: UNIFIED_SEARCH_MODEL env > client-selected model > configured default. Logs the effective model (backend.model()) so model A/B tests are visible. Documented in .env.example. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 01:58:48 -04:00
Cameron Cordes	7e21213181	Reels: bound disk/ledger growth (pre-gen prune + on-demand cache sweep) Nothing reaped reels before, so the on-disk cache and ledger grew unbounded — each night's daily reel is a new ~4MB file + ledger row that's stale within ~26h. - Pre-gen self-prune: after recording a reel, prune_superseded keeps the newest PREGEN_KEEP_PER_SCOPE (2) rows per (span, library) and unlinks the superseded reels' mp4+sidecar. Caps the ledger/disk at ~spans×libraries×2. - On-disk sweeper (spawn_reel_cache_sweeper): every 24h, removes reel mp4s with no ledger row and no live job older than REEL_CACHE_MAX_AGE_DAYS (7) — bounding the on-demand cache, which has no ledger row and otherwise grows forever — plus crashed-render cruft (.mp4.tmp/.concat.txt/orphan sidecars). Runs regardless of REEL_PREGEN_ENABLED; disable with REEL_CACHE_SWEEP_ENABLED=0. - New DAO methods prune_superseded + all_cache_keys (with tests); env knobs documented in .env.example. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 23:27:32 -04:00
Cameron Cordes	5c9ee56527	Fix agentic reel audit issues: midnight bug, DAO wiring, dead code, DST timezone, validation Blocking fixes: - secs_until_next_run_hour: same-hour now returns 0 instead of 24h - capture_prefs: called at both handler return points, never fails request - capture_prefs: resolves library param, upserts to user_ai_prefs via DAO - Scheduler: uses AppState DAOs instead of separate connections - Pregen dedup: uses resolved library param instead of hardcoded 'all' - run_readonly_tool_loop: added #[allow(dead_code)] (used in main.rs only) - run_readonly_tool_loop: removed dead messages.push() call - InsightGenerator: added exif_dao() getter for scheduler reuse Medium fixes: - Input validation: run_hour clamped 0-23, week_dow clamped 0-6 - DST-sensitive timezone: fixed_tz_offset() with env var config Low fixes: - Documented REEL_PREGEN_MAX_TOOL_ITERS and REEL_PREGEN_TZ_FIXED_MINUTES - Removed dead test_app_state function and unused imports Also fix: UpsertUserAiPrefs import path, chrono::Local::with_ymd_and_hms requires TimeZone trait + .single(), unwrap_or_else closure simplification	2026-06-13 14:59:00 -04:00
Cameron Cordes	f707353807	feat: nightly agentic pre-generation of memory reels Implement end-to-end nightly pre-generation of memory reels with agentic scripting that grounds narration in calendar, location, messages, and RAG. Sections A-E from the plan: A. Extract produce_reel pipeline core from run_reel_job with ScripterMode::Fast/Agentic and progress callbacks. B. Agentic scripter: factor run_readonly_tool_loop from the insight generator, build read-only tool gate, prompt builder with GPS, and generate_script_agentic with fallback to fast path. C. Precomputed reels ledger (SQLite table + DAO), GET /reels/precomputed handler with validity gate, GET /reels/by-key/{key}/video streaming, and normalize_library_key helper. D. Nightly scheduler: spawn_pregen_scheduler with configurable hour, run_pregen_batch (day/week/month spans), pregen_one with dedup and disk-check, secs_until_next_run_hour time math. E. user_ai_prefs passive mirror table + DAO for param capture in create_reel_handler and replay in the scheduler. Also fixes resolve_library_param signature to take &[Library] and adds resolve_library_param_state wrapper for AppState callers. New files: migrations/2026-06-13-000000_add_precomputed_reels/, migrations/2026-06-13-000010_add_user_ai_prefs/, src/database/precomputed_reel_dao.rs, src/database/user_ai_prefs_dao.rs	2026-06-13 14:29:34 -04:00
Cameron Cordes	d8dd260c6b	Give TTS synthesis its own (longer) request timeout Long insights are chunked + synthesized server-side and can run past the shared 180s chat/embedding client timeout, causing spurious timeouts. /tts/speech now uses a per-request timeout from LLAMA_SWAP_TTS_REQUEST_TIMEOUT_SECONDS (default 600), overriding the client default without affecting chat/embeddings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 10:25:06 -04:00
Cameron Cordes	62d517dcda	Normalize voice-clone reference audio to WAV via ffmpeg Chatterbox validates the reference clip by file extension and rejects formats like .aac/.opus. Always transcode the reference (upload bytes and library files alike) to mono 24 kHz WAV with ffmpeg before forwarding, so any source format is accepted and the from-library audio/video paths are unified. The reference length cap is now configurable via LLAMA_SWAP_TTS_REF_SECONDS (default 30) — Chatterbox is zero-shot, so a clean ~10-20s clip is the sweet spot. Drops the now-unused mime guesser. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 22:50:08 -04:00
Cameron Cordes	35c5ecb427	Document TTS endpoints and env in README + .env.example Adds the /tts/speech and /tts/voices* endpoints plus LLAMA_SWAP_TTS_MODEL / LLAMA_SWAP_TTS_VOICE (TTS only needs LLAMA_SWAP_URL, not LLM_BACKEND=llamacpp). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 22:34:34 -04:00
Cameron Cordes	fb388c29d7	docs: update env + CLAUDE.md for direct-vision llamacpp + ResolvedBackend llamacpp models now receive images directly instead of describe-then-inline. LLAMA_SWAP_VISION_MODEL defaults to the primary model. Document the ResolvedBackend dispatch pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 15:03:12 -04:00
Cameron Cordes	be51421b38	ai: collapse llamacpp into LLM_BACKEND env switch Reverts the per-request backend="llamacpp" value. Chat/vision/embedding backend is now a deploy-time decision (LLM_BACKEND=ollama|llamacpp), applied globally across chat, vision describe, and embeddings — so embedding vectors stay in one space across the index. - Per-request backend whitelist back to "local"|"hybrid". A request arriving with backend="llamacpp" is rejected. - LLM_BACKEND=llamacpp swaps the entire local stack to llama-swap: chat hits the chat slot, describe hits the vision slot, embeddings hit the embed slot. Hybrid mode still routes chat to OpenRouter but uses LLM_BACKEND for the describe pass. - Drops env vars HYBRID_VISION_BACKEND, LLAMA_SWAP_VISION_MODELS, EMBEDDING_BACKEND (the last never shipped). Drops the LlamaCppClient.vision_models allowlist — capability inference now reports has_vision only for the configured vision_model slot. - Drops the /insights/llamacpp/models handler. /insights/models is the single endpoint; returns Ollama servers under LLM_BACKEND=ollama and llama-swap slots (from LLAMA_SWAP_ALLOWED_MODELS) under LLM_BACKEND=llamacpp. Same envelope shape either way. - New ai::embed_one helper routes embeddings through llama-swap when LLM_BACKEND=llamacpp (else Ollama). Wires it into the four insight_generator embedding sites. - Cross-replay matrix simplifies to pre-llamacpp shape (local↔local, hybrid↔hybrid, hybrid→local allowed; local→hybrid rejected).	2026-05-21 11:36:58 -04:00
Cameron Cordes	d14df63f19	env.example: document LLAMA_SWAP_* + HYBRID_VISION_BACKEND vars Mirrors the section added to CLAUDE.md so deploys can opt into the llamacpp backend from the template alone.	2026-05-20 17:54:08 -04:00
Cameron Cordes	ee2ed3005b	clip-search: document env knobs in .env.example APOLLO_CLIP_API_BASE_URL (falls back to APOLLO_API_BASE_URL), CLIP_BACKLOG_MAX_PER_TICK, CLIP_ENCODE_CONCURRENCY, and CLIP_REQUEST_TIMEOUT_SEC — all of which the code already reads. Apollo's side was documented earlier; this closes the parity gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 16:10:52 -04:00
Cameron Cordes	675b4a4849	faces: add .env.example template covering all documented env vars The face-recognition plan and CLAUDE.md document the full env-var surface (face detection knobs, Apollo / Ollama / OpenRouter / SMS integrations, watch intervals, RAG flags), but no example file existed — operators copying the project to a new deploy had nothing to start from. Group by section, comment out optional integrations so a minimal copy boots without external services. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 13:51:45 +00:00

12 Commits
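
Commit 0a40e78528 states the model precedence for the unified-search translation step as: UNIFIED_SEARCH_MODEL env > client-selected model > configured default. The sketch below shows one way that rule could be expressed. It is not the repository's code: the function names `effective_model` and `non_empty` are hypothetical, and treating an empty or whitespace-only value as "unset" is an assumption the commit message does not state.

```rust
/// Hypothetical helper: treat empty or whitespace-only values as unset.
/// (Assumption; the commit message does not say how empty values are handled.)
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|v| !v.is_empty())
}

/// Hypothetical sketch of the precedence described in commit 0a40e78528:
/// env override > client-selected model > configured default.
///
/// `env_override` would typically come from
/// `std::env::var("UNIFIED_SEARCH_MODEL").ok()`.
fn effective_model(
    env_override: Option<&str>,
    client_model: Option<&str>,
    default_model: &str,
) -> String {
    non_empty(env_override)
        .or_else(|| non_empty(client_model))
        .unwrap_or(default_model)
        .to_string()
}
```

Taking the override as a parameter rather than reading the environment inside the function keeps the rule easy to test; a caller might pass `std::env::var("UNIFIED_SEARCH_MODEL").ok().as_deref()` and log the returned value.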