diff --git a/CLAUDE.md b/CLAUDE.md index 3567915..1e34b44 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -658,7 +658,12 @@ clients whether chat is available for a given insight. - `POST /insights/chat` runs one turn of the agentic loop against the replayed history. Body: `{ file_path, library?, user_message, model?, backend?, num_ctx?, - temperature?, top_p?, top_k?, min_p?, max_iterations?, amend? }`. + temperature?, top_p?, top_k?, min_p?, max_iterations?, system_prompt?, amend? }`. + `system_prompt` is a per-turn override: in append mode (default) it's applied + ephemerally — the original system message is restored before persistence so + the stored transcript keeps its baked persona. In amend mode the override + stays in place and becomes the new insight row's system message. Mirrors the + internal `annotate_system_with_budget` swap-and-restore pattern. - `POST /insights/chat/stream` is the SSE variant — same request body, response is `text/event-stream` with events: `iteration_start`, `text` (delta), `tool_call`, `tool_result`, `truncated`, `done`, plus a server-emitted `error_message` on diff --git a/specs/002-insight-chat-improvements.md b/specs/002-insight-chat-improvements.md new file mode 100644 index 0000000..4efa9e3 --- /dev/null +++ b/specs/002-insight-chat-improvements.md @@ -0,0 +1,392 @@ +# Insight Chat improvements — design + +**Date:** 2026-05-07 +**Branch:** `feature/insight-chat-improvements` (in both `ImageApi/` and `FileViewer-React/`) +**Scope:** ImageApi photo-anchored insight + chat surface, plus the +FileViewer-React client. Apollo's free/visit chat is **not** in this cycle. + +## Problem + +Three concrete gaps in today's insight + chat surface: + +1. **Tool drift.** ImageApi exposes 13 tools to the LLM. Some are gated on + `apollo_enabled` / `has_vision`, but several optional ones + (`search_rag`, `get_calendar_events`, `get_location_history`) are + registered unconditionally even when their backing tables are empty. + Descriptions vary in quality and a couple have outright bugs. +2. **Inconsistent / incomplete tool descriptions.** Tools like + `search_messages` describe their selection rules but omit useful + examples; `store_fact` doesn't show the `object_entity_id` vs + `object_value` choice; `get_sms_messages` accepts a `days_radius` + parameter that the backing client silently ignores. The LLM is being + instructed against a slightly wrong reality. +3. **System prompt fights the persona.** Today's generation prompt + prepends the user's `custom_system_prompt` and then immediately asserts + `"You are a personal photo memory assistant..."`. The user message + demands `"a detailed insight with a title and summary"`. Both + contradict whatever voice / shape / POV the persona just established. + On chat continuation the persona is baked into the stored transcript at + generation time and can't be changed without regenerating. + +## Goals + +- Tool catalog is **representative** — every tool registered for a turn is + backed by data the user actually has. +- Tool descriptions are **concise but complete**, with examples for any + tool whose param choice has multiple modes or non-obvious interactions. +- Persona / system prompt is **authoritative** for voice, length, and + shape — both at generation and during chat continuation. +- Per-turn system prompt overrides on chat work without surprising + side-effects on the stored transcript outside `amend` mode. + +## Non-goals + +- Apollo backend / frontend changes. Separate cycle. +- Refactoring the `generate_photo_title` post-hoc title flow. Already + takes `custom_system_prompt`. +- Tool consolidation (e.g. merging `search_messages` + `get_sms_messages`). + Considered and deferred — keeps blast radius small. +- Removing knowledge-memory tools (`recall_*` / `store_*`). Audit + confirmed they have a live read path via `knowledge.rs` HTTP routes. +- Persisting persona changes to the stored transcript outside `amend` + mode. Deliberate — re-opens use the persona currently active in the + client, not a sticky historical setting. + +--- + +## Design + +### A. System prompt — generation + +Today (`insight_generator.rs:3305–3326`): + +``` +[custom_system_prompt if any] + +"You are a personal photo memory assistant helping to reconstruct..." + +{owner_id_note} + +{fewshot_block} + +"IMPORTANT INSTRUCTIONS: +1. You MUST call multiple tools... +2. When calling get_sms_messages and search_rag... +3. Use recall_facts_for_photo... +... +8. You have a hard budget of {max_iterations} iterations..." +``` + +The first concatenation is the bug: `custom` claims one identity, the +next line asserts another. + +**New structure** — two named blocks, in order: + +``` +[Identity / voice / format block] ← persona-controlled (or neutral default) +[Procedural block] ← always identity-free +``` + +**Identity block:** +- When `custom_system_prompt` is supplied: use that string verbatim, no + pre/append. +- When not: a neutral default that doesn't fight a future persona. + Working text: `"You are reconstructing a memory from a photo. Use the + gathered context to write a thoughtful summary; you decide voice, + length, and shape."` + +**Procedural block** — identity-free, always emitted: + +``` +Tool-use guidance: +- You have a budget of {max_iterations} tool-calling iterations. +- Call tools to gather context BEFORE writing your final answer; don't + answer after one or two calls. +- When calling get_sms_messages or search_rag, make at least one call + WITHOUT a contact filter — surrounding events matter even when a + contact is known. +- Use recall_facts_for_photo + recall_entities to load any prior + knowledge about subjects in the photo. +- When you identify people / places / events / things, use store_entity + + store_fact to grow the persistent memory. +- A tool returning no results is informative; continue with the others. + +{owner_id_note if applicable} +{fewshot_block if applicable} +``` + +Differences from today's "IMPORTANT INSTRUCTIONS" block: removed the +"you are a personal photo memory assistant" framing and the explicit +"at least 5 tool calls" floor (replaced with the softer "don't answer +after one or two"). Few-shot stays — it's pattern-of-tool-use, not +identity. + +### B. User message — generation + +Today (line 3357): + +``` +{visual_block}Please analyze this photo and gather any relevant context +from the surrounding weeks. + +Photo file path: {file_path} +Date taken: {date} +{contact_info} +{gps_info} +{tags_info} + +Use the available tools to gather more context about this moment +(messages, calendar events, location history, etc.), then write a +detailed insight with a title and summary. +``` + +Problems: the trailing line bakes in output shape ("title and +summary"), and the title from the resulting response is **discarded +anyway** — `generate_photo_title` (line 3494) regenerates the title +post-hoc from the summary. So the prompt is constraining voice for no +data-model benefit. + +**New payload** — context-only, no output prescription: + +``` +{visual_block}Photo file path: {file_path} +Date taken: {date} +{contact_info} +{gps_info} +{tags_info} + +Gather context with the available tools, then respond. +``` + +The persona owns shape. If a user wants "title-then-paragraph" output, +their persona prompt says so. + +### C. System prompt — chat continuation + +Add `system_prompt: Option` to `ChatTurnRequest` (and to its +HTTP wrapper `ChatTurnHttpRequest`). It carries through both the +non-streaming `chat_turn` and the streaming `chat_turn_stream`. + +**Append mode (default, `amend=false`)** — ephemeral +swap-and-restore, mirroring the existing `annotate_system_with_budget` +pattern: + +1. Load stored transcript. +2. If `system_prompt` is `Some(s)`: + - If first message is a `system` role: stash original content, + replace with `s`. + - Else: prepend a synthetic ephemeral system message with `s` (note + it's synthetic so the restore step pops it rather than rewriting). +3. Run `annotate_system_with_budget` on top (existing per-turn budget + note appends to whatever's there now). +4. Run the agentic loop. +5. **Before persistence**, restore the original system content (or pop + the synthetic one). Run `restore_system_content` for the budget + annotation as today. +6. Save. + +Result: the model sees the override; the stored transcript is +unchanged outside the model's actual reply. + +**Amend mode (`amend=true`)**: + +- If `system_prompt` is supplied: the override stays in place during + the serialization for the new insight row. The new row's + `training_messages` system message is the override. `is_current=false` + flips on prior rows as today. +- If not supplied: behaves as today (stored transcript's system message + carries forward unchanged). + +### D. FileViewer-React — client wiring + +`hooks/useInsightChat.tsx`: +- `SendTurnOptions` gains `systemPromptOverride?: string | null`. +- Inside `sendTurn`, before issuing the streaming POST: + 1. Read the active persona's `systemPrompt` from AsyncStorage + (already loaded for generation flows — reuse the same accessor). + 2. If a one-shot `systemPromptOverride` is set, append as a suffix + (`${persona}\n\n${override}`) so persona voice survives + override + tweaks the turn. + 3. Include the resulting string as `system_prompt` on the request body. +- No history-load change. The history endpoint still returns the stored + transcript. + +`components/InsightChatModal.tsx`: +- Add a small "Style note" composer affordance — a one-shot text input + that, when filled, becomes the `systemPromptOverride` for the next + send. Cleared after send. +- The existing persona chip continues to open `PersonaManagerModal`. + +`hooks/usePersonas.tsx` and the bundled defaults: +- Built-in `assistant` and `journal` prompts get audited and rewritten + to **explicitly state voice / shape / length** — since the framework + no longer guarantees a default shape, the persona must. + +### E. Tool catalog — gating + +Widen `build_tool_definitions` from `(has_vision: bool, apollo_enabled: +bool)` to a single `ToolGateOpts` struct: + +```rust +pub struct ToolGateOpts { + pub has_vision: bool, + pub apollo_enabled: bool, + pub daily_summaries_present: bool, + pub calendar_present: bool, + pub location_history_present: bool, +} +``` + +The chat / generation services compute the three new fields lazily per +turn via `SELECT 1 FROM LIMIT 1` (cheap; cached for the turn's +duration). Lazy because operators import data after launch and we don't +want to require a restart for the LLM to discover its new capabilities. + +Per-tool gating: + +| Tool | Existing gate | New gate | +|---|---|---| +| `describe_photo` | `has_vision` | unchanged | +| `get_personal_place_at` | `apollo_enabled` | unchanged | +| `get_calendar_events` | none | `calendar_present` | +| `get_location_history` | none | `location_history_present` | +| `search_rag` | none | `daily_summaries_present` | + +All other tools always-on. (`get_sms_messages` and `search_messages` +fail informatively if SMS-API is unreachable; not worth a startup probe +since intermittent failures are the same shape.) + +### F. Tool descriptions — convention + +Every description follows: + +1. One sentence: **what** + **when to call**. +2. Param semantics worth knowing (units, ranges, mode behavior, + precedence). +3. **Example invocation** for tools with multiple modes, optional bands, + or non-obvious parameter interactions. +4. Cross-references when relevant: `prefer X when both apply`. + +Banned: all-caps section headers inside descriptions +(`"CONTENT search"`, `"TIME-BASED fetch"`); persona-prescriptive language +(`"you are a..."`); behavioral references to other tools by description +rather than name. + +Tools getting examples: `search_messages`, `search_rag`, `store_fact`, +`get_sms_messages`. Trivial tools (`get_current_datetime`, +`reverse_geocode`, `get_file_tags`) skip the example. + +Sample (`search_messages`): + +> Search SMS/MMS message bodies. Modes: `fts5` (keyword + phrase + prefix +> + AND/OR/NOT + NEAR proximity), `semantic` (embedding similarity, +> requires generated embeddings), `hybrid` (RRF merge, recommended; +> degrades to `fts5` when embeddings absent). Optional `start_ts` / +> `end_ts` (real-UTC unix seconds) and `contact_id` filters. For pure +> date / contact browsing without keywords, prefer `get_sms_messages`. +> +> Examples: +> - `{query: "trader joe's"}` — phrase across all time. +> - `{query: "dinner", contact_id: 42, start_ts: 1700000000, end_ts: 1700604800}` +> — keyword within a contact and a week. +> - `{query: "NEAR(meeting work, 5)"}` — proximity search. + +### G. SMS tool fixes + +#### `get_sms_messages` — honor `days_radius` + +Today: `sms_client::fetch_messages_for_contact(contact, center_ts)` +hardcodes `Duration::days(4)` (lines 31–37). The tool accepts +`days_radius` and silently ignores it. + +**Fix:** widen the signature to +`fetch_messages_for_contact(contact, center_ts, days_radius)`. Tool +plumbs through. Default 4 retained for back-compat. + +#### `search_messages` — add date and contact_id filters + +Today: ImageApi's `search_messages` only forwards `query`, `mode`, +`limit` to SMS-API. + +**Fix:** add `start_ts`, `end_ts`, `contact_id` parameters. +- `contact_id` forwards directly to SMS-API + (`/api/messages/search/?contact_id=`). +- `start_ts` / `end_ts` are not natively accepted by SMS-API's search + endpoint. Apply client-side post-filter on the response (Apollo's + pattern: `chat_tools.py:670–680`). Bump the SMS-API `limit` to a + larger fetch pool when a date filter is supplied so in-window matches + aren't lost to out-of-window FTS rank. + +--- + +## Implementation sequencing + +Each step is independently mergeable. + +### ImageApi PRs + +1. **Split system-prompt assembly + neutralize user message.** Two + named blocks; user message context-only. Default identity string + added. Tests: golden snapshots of the resulting `system_content` + with and without `custom_system_prompt`. +2. **`system_prompt` field on chat request + swap/restore + amend + persistence.** Mirrors `annotate_system_with_budget` pattern. Tests: + round-trip system content unchanged in append mode; persisted in + amend mode. +3. **`fetch_messages_for_contact` honors `days_radius`.** Tool wires + the param through. Tests: window math at the client level. +4. **`ToolGateOpts` + per-tool description rewrites.** Description + text changes are the bulk of the diff but no behavior change beyond + gating. + +### FileViewer-React PR + +5. **Chat hook sends `system_prompt`; modal gets style-note input; + built-in personas updated to specify shape.** The + `useInsightChat.sendTurn` call site picks up the persona and + includes it on every chat turn body. Style-note input is a one-shot + suffix. + +## Testing & verification + +**Automated:** +- Unit (Rust): swap-and-restore round-trip preserves stored transcript. +- Unit (Rust): amend mode persists override into new insight row. +- Unit (Rust): `fetch_messages_for_contact(days_radius=N)` produces a + window of `2N` days centered on `center_ts`. +- Unit (Rust): `build_tool_definitions(opts)` excludes gated tools when + the corresponding flag is false. + +**Manual:** +- Run a chat turn against an existing insight without `system_prompt` → + output unchanged from baseline. +- Same insight, with override → output reflects new voice. +- Re-open chat → original baked persona still authoritative (override + was ephemeral). +- Regenerate an insight with the journal persona → model's voice + matches journal style; no "memory assistant" framing leaks through. +- Toggle data presence (delete a row from `calendar_events`) → tool + drops from the catalog on the next turn. + +## Risks + +- **Default identity wording matters.** A too-neutral default ("Use the + gathered context to write a summary") might produce flatter output + than today's "personal photo memory assistant" framing for users + who never set a persona. Mitigation: tune the default with a small + set of test photos before merging. +- **Persona-suffix style notes can contradict persona voice.** A user + who picks `journal` (first person, warm) and adds the style note + "respond in bullet points" will get a tonal collision. Acceptable — + the user expressed a per-turn intent and we honor it. Document the + composition rule in the persona-manager UI. +- **Lazy data-presence probes add a per-turn `SELECT 1`.** Negligible + on SQLite (sub-millisecond) but adds up across many turns. Cache the + result for the turn's duration; don't re-probe per-tool. + +## Open questions + +None blocking. Items deferred to a possible follow-up cycle: + +- Apollo parity for the same per-turn override pattern (already + present; just needs RN client wiring on the photo path which is + already proxy). +- Tool consolidation (`search_messages` + `get_sms_messages` → + single `search_messages` with optional date filter, Apollo-style). + Considered and deferred — separate spec. diff --git a/src/ai/handlers.rs b/src/ai/handlers.rs index 7ec6ab7..f6a0a37 100644 --- a/src/ai/handlers.rs +++ b/src/ai/handlers.rs @@ -640,6 +640,10 @@ pub struct ChatTurnHttpRequest { pub min_p: Option, #[serde(default)] pub max_iterations: Option, + /// Per-turn system-prompt override. Ephemeral in append mode, + /// persisted in amend mode. See ChatTurnRequest for semantics. + #[serde(default)] + pub system_prompt: Option, #[serde(default)] pub amend: bool, } @@ -695,6 +699,7 @@ pub async fn chat_turn_handler( top_k: request.top_k, min_p: request.min_p, max_iterations: request.max_iterations, + system_prompt: request.system_prompt.clone(), amend: request.amend, }; @@ -909,6 +914,7 @@ pub async fn chat_stream_handler( top_k: request.top_k, min_p: request.min_p, max_iterations: request.max_iterations, + system_prompt: request.system_prompt.clone(), amend: request.amend, }; diff --git a/src/ai/insight_chat.rs b/src/ai/insight_chat.rs index 792f124..bbb03dd 100644 --- a/src/ai/insight_chat.rs +++ b/src/ai/insight_chat.rs @@ -45,6 +45,11 @@ pub struct ChatTurnRequest { pub top_k: Option, pub min_p: Option, pub max_iterations: Option, + /// Per-turn system-prompt override. In append mode (default), applied + /// ephemerally — original system message restored before persistence. + /// In amend mode, persisted into the new insight row's system message. + /// None / empty = no change. + pub system_prompt: Option, /// When true, write a new insight row (regenerating title) instead of /// updating training_messages on the existing row. pub amend: bool, @@ -359,10 +364,12 @@ impl InsightChatService { .map(|imgs| !imgs.is_empty()) .unwrap_or(false); let offer_describe_tool = !is_hybrid && local_first_user_has_image; - let tools = InsightGenerator::build_tool_definitions( - offer_describe_tool, - self.generator.apollo_enabled(), - ); + // current_gate_opts(has_vision) sets gate_opts.has_vision = has_vision + // and probes the per-table presence flags. Pass `offer_describe_tool` + // directly — the `!is_hybrid && local_first_user_has_image` decision + // is the chat-path's vision predicate. + let gate_opts = self.generator.current_gate_opts(offer_describe_tool); + let tools = InsightGenerator::build_tool_definitions(gate_opts); // Image base64 only needed when describe_photo is on the menu. Load // lazily to avoid disk IO when the loop never invokes it. @@ -385,6 +392,13 @@ impl InsightChatService { // 7. Append the new user turn. messages.push(ChatMessage::user(req.user_message.clone())); + // Apply per-turn system-prompt override BEFORE the budget annotation + // so the budget note attaches to the override, not the original. + // The stash is consumed below before persistence (append mode) or + // dropped (amend mode, where the override stays in place). + let override_stash = + apply_system_prompt_override(&mut messages, req.system_prompt.as_deref()); + // Temporarily annotate the system message with this turn's iteration // budget so the model knows how many tool-calling rounds it has. We // restore the original content before persistence so the note doesn't @@ -481,6 +495,14 @@ impl InsightChatService { // before we persist so it doesn't snowball on each subsequent turn. restore_system_content(&mut messages, original_system_content); + // Append mode: undo the per-turn system-prompt override so the + // stored transcript keeps the original baked persona. Amend mode: + // keep the override in place — it becomes the new insight row's + // system message. + if !req.amend { + restore_system_prompt_override(&mut messages, override_stash); + } + // 9. Persist. Append mode rewrites the JSON blob in place; amend // mode regenerates the title and inserts a new insight row, // relying on store_insight to flip prior rows' is_current=false. @@ -790,10 +812,12 @@ impl InsightChatService { .map(|imgs| !imgs.is_empty()) .unwrap_or(false); let offer_describe_tool = !is_hybrid && local_first_user_has_image; - let tools = InsightGenerator::build_tool_definitions( - offer_describe_tool, - self.generator.apollo_enabled(), - ); + // current_gate_opts(has_vision) sets gate_opts.has_vision = has_vision + // and probes the per-table presence flags. Pass `offer_describe_tool` + // directly — the `!is_hybrid && local_first_user_has_image` decision + // is the chat-path's vision predicate. + let gate_opts = self.generator.current_gate_opts(offer_describe_tool); + let tools = InsightGenerator::build_tool_definitions(gate_opts); let image_base64: Option = if offer_describe_tool { self.generator.load_image_as_base64(&normalized).ok() @@ -812,6 +836,10 @@ impl InsightChatService { messages.push(ChatMessage::user(req.user_message.clone())); + // Mirror chat_turn: per-turn override goes on first, budget note next. + let override_stash = + apply_system_prompt_override(&mut messages, req.system_prompt.as_deref()); + let original_system_content = annotate_system_with_budget(&mut messages, max_iterations); let mut tool_calls_made = 0usize; @@ -946,6 +974,13 @@ impl InsightChatService { // before we persist so it doesn't snowball on each subsequent turn. restore_system_content(&mut messages, original_system_content); + // Append mode: undo the per-turn system-prompt override (mirrors + // chat_turn). Amend mode: keep the override — it becomes the new + // insight row's system message. + if !req.amend { + restore_system_prompt_override(&mut messages, override_stash); + } + // Persist. let json = serde_json::to_string(&messages) .map_err(|e| anyhow!("failed to serialize chat history: {}", e))?; @@ -1153,6 +1188,64 @@ fn restore_system_content(messages: &mut [ChatMessage], original: Option } } +/// Receipt produced by [`apply_system_prompt_override`] so the caller can +/// undo the override before persistence. Two variants because we either +/// replaced an existing system message (need its original content) or +/// prepended a synthetic one (need to pop it). +#[derive(Debug)] +pub(crate) enum SystemPromptStash { + Replaced { original: String }, + Prepended, +} + +/// Apply a per-turn `system_prompt` override to `messages` so the model +/// sees the requested persona for this turn. Returns a stash the caller +/// must pass to [`restore_system_prompt_override`] before persisting the +/// transcript — without that step, append-mode chat would silently +/// rewrite the stored persona. +/// +/// No-op (returns `None`) when `override_prompt` is `None` or empty. +pub(crate) fn apply_system_prompt_override( + messages: &mut Vec, + override_prompt: Option<&str>, +) -> Option { + let prompt = override_prompt + .map(str::trim) + .filter(|s| !s.is_empty())? + .to_string(); + if let Some(first) = messages.first_mut() + && first.role == "system" + { + let original = std::mem::replace(&mut first.content, prompt); + return Some(SystemPromptStash::Replaced { original }); + } + messages.insert(0, ChatMessage::system(prompt)); + Some(SystemPromptStash::Prepended) +} + +/// Undo an override previously applied by [`apply_system_prompt_override`]. +/// No-op when `stash` is `None`. +pub(crate) fn restore_system_prompt_override( + messages: &mut Vec, + stash: Option, +) { + let Some(stash) = stash else { return }; + match stash { + SystemPromptStash::Replaced { original } => { + if let Some(first) = messages.first_mut() + && first.role == "system" + { + first.content = original; + } + } + SystemPromptStash::Prepended => { + if matches!(messages.first(), Some(m) if m.role == "system") { + messages.remove(0); + } + } + } +} + /// View returned to clients for chat-UI rendering. #[derive(Debug)] pub struct HistoryView { @@ -1386,4 +1479,94 @@ mod tests { let cut = find_raw_cut(&msgs, 2).expect("boundary cut should succeed"); assert_eq!(cut, msgs.len()); } + + #[test] + fn apply_override_replaces_existing_system_message() { + let mut msgs = vec![ + ChatMessage::system("original persona"), + ChatMessage::user("hi"), + ]; + let stash = apply_system_prompt_override(&mut msgs, Some("new persona")); + assert_eq!(msgs[0].content, "new persona"); + match stash { + Some(SystemPromptStash::Replaced { original }) => { + assert_eq!(original, "original persona"); + } + other => panic!("expected Replaced, got {:?}", other), + } + } + + #[test] + fn apply_override_prepends_synthetic_when_missing() { + let mut msgs = vec![ChatMessage::user("hi")]; + let stash = apply_system_prompt_override(&mut msgs, Some("new persona")); + assert_eq!(msgs.len(), 2); + assert_eq!(msgs[0].role, "system"); + assert_eq!(msgs[0].content, "new persona"); + assert!(matches!(stash, Some(SystemPromptStash::Prepended))); + } + + #[test] + fn apply_override_no_op_when_none() { + let mut msgs = vec![ChatMessage::system("sys"), ChatMessage::user("hi")]; + let stash = apply_system_prompt_override(&mut msgs, None); + assert!(stash.is_none()); + assert_eq!(msgs[0].content, "sys"); + } + + #[test] + fn apply_override_no_op_for_empty_string() { + let mut msgs = vec![ChatMessage::system("sys")]; + let stash = apply_system_prompt_override(&mut msgs, Some("")); + assert!(stash.is_none()); + assert_eq!(msgs[0].content, "sys"); + } + + #[test] + fn restore_override_replaces_back() { + let mut msgs = vec![ChatMessage::system("new"), ChatMessage::user("hi")]; + restore_system_prompt_override( + &mut msgs, + Some(SystemPromptStash::Replaced { + original: "original".to_string(), + }), + ); + assert_eq!(msgs[0].content, "original"); + assert_eq!(msgs.len(), 2); + } + + #[test] + fn restore_override_pops_synthetic() { + let mut msgs = vec![ChatMessage::system("new"), ChatMessage::user("hi")]; + restore_system_prompt_override(&mut msgs, Some(SystemPromptStash::Prepended)); + assert_eq!(msgs.len(), 1); + assert_eq!(msgs[0].role, "user"); + } + + #[test] + fn override_round_trip_preserves_original_system_message() { + let mut msgs = vec![ + ChatMessage::system("original persona"), + ChatMessage::user("first user"), + assistant_text("first reply"), + ]; + let stash = apply_system_prompt_override(&mut msgs, Some("ephemeral persona")); + assert_eq!(msgs[0].content, "ephemeral persona"); + restore_system_prompt_override(&mut msgs, stash); + assert_eq!(msgs[0].content, "original persona"); + assert_eq!(msgs.len(), 3); + assert_eq!(msgs[1].role, "user"); + assert_eq!(msgs[2].role, "assistant"); + } + + #[test] + fn override_with_synthetic_round_trip_drops_extra_message() { + let mut msgs = vec![ChatMessage::user("first user")]; + let stash = apply_system_prompt_override(&mut msgs, Some("ephemeral")); + assert_eq!(msgs.len(), 2); + assert_eq!(msgs[0].role, "system"); + restore_system_prompt_override(&mut msgs, stash); + assert_eq!(msgs.len(), 1); + assert_eq!(msgs[0].role, "user"); + } } diff --git a/src/ai/insight_generator.rs b/src/ai/insight_generator.rs index e03b1af..3e0d7fb 100644 --- a/src/ai/insight_generator.rs +++ b/src/ai/insight_generator.rs @@ -83,12 +83,29 @@ pub struct InsightGenerator { search_dao: Arc>>, tag_dao: Arc>>, + // Face detections (used by the get_faces_in_photo agentic tool) + face_dao: Arc>>, + // Knowledge memory knowledge_dao: Arc>>, libraries: Vec, } +/// Per-call gating flags for `build_tool_definitions`. Tools whose backing +/// data is empty (or whose env-var guard is unset) are dropped from the +/// catalog so the LLM doesn't reach for a tool that always returns "No +/// results found." — that wastes iteration budget. +#[derive(Debug, Clone, Copy, Default)] +pub struct ToolGateOpts { + pub has_vision: bool, + pub apollo_enabled: bool, + pub daily_summaries_present: bool, + pub calendar_present: bool, + pub location_history_present: bool, + pub faces_present: bool, +} + impl InsightGenerator { pub fn new( ollama: OllamaClient, @@ -102,6 +119,7 @@ impl InsightGenerator { location_dao: Arc>>, search_dao: Arc>>, tag_dao: Arc>>, + face_dao: Arc>>, knowledge_dao: Arc>>, libraries: Vec, ) -> Self { @@ -117,6 +135,7 @@ impl InsightGenerator { location_dao, search_dao, tag_dao, + face_dao, knowledge_dao, libraries, } @@ -130,6 +149,51 @@ impl InsightGenerator { self.apollo_client.is_enabled() } + /// Compute the per-call tool gate options by probing each backing + /// table for presence. `daily_summaries_present` uses a `LIMIT 1` + /// existence probe; `calendar_present` and `location_history_present` + /// use the existing `get_event_count` / `get_location_count` + /// methods (small enough that a full `COUNT(*)` is fine). Meant to + /// be called once per chat turn / generation. `has_vision` is + /// supplied by the caller because it depends on the model selected + /// for this turn, not on persistent state. + pub fn current_gate_opts(&self, has_vision: bool) -> ToolGateOpts { + let cx = opentelemetry::Context::new(); + let calendar_present = { + let mut dao = self + .calendar_dao + .lock() + .expect("Unable to lock CalendarEventDao"); + dao.get_event_count(&cx).map(|n| n > 0).unwrap_or(false) + }; + let location_history_present = { + let mut dao = self + .location_dao + .lock() + .expect("Unable to lock LocationHistoryDao"); + dao.get_location_count(&cx).map(|n| n > 0).unwrap_or(false) + }; + let daily_summaries_present = { + let mut dao = self + .daily_summary_dao + .lock() + .expect("Unable to lock DailySummaryDao"); + dao.has_any_summaries(&cx).unwrap_or(false) + }; + let faces_present = { + let mut dao = self.face_dao.lock().expect("Unable to lock FaceDao"); + dao.has_any_faces(&cx).unwrap_or(false) + }; + ToolGateOpts { + has_vision, + apollo_enabled: self.apollo_enabled(), + daily_summaries_present, + calendar_present, + location_history_present, + faces_present, + } + } + /// Resolve `rel_path` against the configured libraries, returning the /// first root under which the file exists. Insights may be generated /// for any library — the generator itself doesn't know which — so we @@ -983,7 +1047,7 @@ impl InsightGenerator { // Step 1: Get FULL immediate temporal context (±4 days, ALL messages) let immediate_messages = self .sms_client - .fetch_messages_for_contact(contact.as_deref(), timestamp) + .fetch_messages_for_contact(contact.as_deref(), timestamp, 4) .await .unwrap_or_else(|e| { log::error!("Failed to fetch immediate messages: {}", e); @@ -1129,7 +1193,7 @@ impl InsightGenerator { log::info!("Using traditional time-based message retrieval (±4 days)"); let sms_messages = self .sms_client - .fetch_messages_for_contact(contact.as_deref(), timestamp) + .fetch_messages_for_contact(contact.as_deref(), timestamp, 4) .await .unwrap_or_else(|e| { log::error!("Failed to fetch SMS messages: {}", e); @@ -1476,6 +1540,7 @@ Return ONLY the summary, nothing else."#, "get_calendar_events" => self.tool_get_calendar_events(arguments, cx).await, "get_location_history" => self.tool_get_location_history(arguments, cx).await, "get_file_tags" => self.tool_get_file_tags(arguments, cx).await, + "get_faces_in_photo" => self.tool_get_faces_in_photo(arguments, cx).await, "describe_photo" => self.tool_describe_photo(ollama, image_base64).await, "reverse_geocode" => self.tool_reverse_geocode(arguments).await, "get_personal_place_at" => self.tool_get_personal_place_at(arguments).await, @@ -1711,24 +1776,20 @@ Return ONLY the summary, nothing else."#, } /// Tool: search_messages — keyword / semantic / hybrid search over all - /// SMS message bodies via the Django FTS5 + embeddings pipeline. Unlike - /// `search_rag` (daily summaries, date-weighted) this hits raw message - /// text across time and is the right choice for exact phrases, proper - /// nouns, URLs, or anything where specific wording matters. + /// SMS message bodies via the Django FTS5 + embeddings pipeline. Now + /// supports optional `contact_id`, `start_ts`, `end_ts` filters. async fn tool_search_messages(&self, args: &serde_json::Value) -> String { let query = match args.get("query").and_then(|v| v.as_str()) { Some(q) if !q.trim().is_empty() => q.trim(), _ => { - // Redirect when the model reached for this tool with a - // date/contact-shaped intent — get_sms_messages is the right - // call. Without this hint, small models often just retry - // search_messages again with the same args. - let has_date = args.get("date").is_some(); - let has_contact = args.get("contact").is_some(); + let has_date = args.get("date").is_some() + || args.get("start_ts").is_some() + || args.get("end_ts").is_some(); + let has_contact = args.get("contact").is_some() || args.get("contact_id").is_some(); if has_date || has_contact { return "Error: search_messages needs a 'query' (keywords/phrase). \ - To fetch messages around a date or from a contact, call \ - get_sms_messages with { date, contact? } instead." + To fetch messages around a date or from a contact without keywords, \ + call get_sms_messages with { date, contact? } instead." .to_string(); } return "Error: missing required parameter 'query'".to_string(); @@ -1748,51 +1809,96 @@ Return ONLY the summary, nothing else."#, mode ); } - let limit = args + let user_limit = args .get("limit") .and_then(|v| v.as_i64()) .unwrap_or(20) .clamp(1, 50) as usize; + let contact_id = args.get("contact_id").and_then(|v| v.as_i64()); + let start_ts = args.get("start_ts").and_then(|v| v.as_i64()); + let end_ts = args.get("end_ts").and_then(|v| v.as_i64()); + let has_date_filter = start_ts.is_some() || end_ts.is_some(); + + // When a date filter is supplied, fetch a larger pool from SMS-API + // so in-window matches that ranked lower than out-of-window ones + // aren't lost. + let fetch_limit = if has_date_filter { 100 } else { user_limit }; log::info!( - "tool_search_messages: query='{}', mode={}, limit={}", + "tool_search_messages: query='{}', mode={}, contact_id={:?}, range=[{:?}, {:?}], user_limit={}, fetch_limit={}", query, mode, - limit + contact_id, + start_ts, + end_ts, + user_limit, + fetch_limit ); - match self.sms_client.search_messages(query, &mode, limit).await { - Ok(hits) if hits.is_empty() => "No messages matched.".to_string(), - Ok(hits) => { - let mut out = String::new(); - out.push_str(&format!( - "Found {} messages (mode: {}):\n\n", - hits.len(), - mode - )); - let user_name = user_display_name(); - for h in hits { - let date = chrono::DateTime::from_timestamp(h.date, 0) - .map(|dt| dt.format("%Y-%m-%d").to_string()) - .unwrap_or_else(|| h.date.to_string()); - let direction: &str = if h.type_ == 2 { - &user_name - } else { - &h.contact_name - }; - let score = h - .similarity_score - .map(|s| format!(" [score {:.2}]", s)) - .unwrap_or_default(); - out.push_str(&format!( - "[{}]{} {} — {}\n\n", - date, score, direction, h.body - )); + let hits = match self + .sms_client + .search_messages_with_contact(query, &mode, fetch_limit, contact_id) + .await + { + Ok(h) => h, + Err(e) => return format!("Error searching messages: {}", e), + }; + + // Date-range post-filter on the client side. SMS-API's /search/ + // doesn't accept date params; mirroring Apollo's pattern here. + let filtered: Vec<_> = hits + .into_iter() + .filter(|h| { + if let Some(s) = start_ts + && h.date < s + { + return false; } - out - } - Err(e) => format!("Error searching messages: {}", e), + if let Some(e) = end_ts + && h.date > e + { + return false; + } + true + }) + .take(user_limit) + .collect(); + + if filtered.is_empty() { + return "No messages matched.".to_string(); } + + let user_name = user_display_name(); + let mut out = String::new(); + out.push_str(&format!( + "Found {} messages (mode: {}{}):\n\n", + filtered.len(), + mode, + if has_date_filter { + ", date-filtered" + } else { + "" + } + )); + for h in filtered { + let date = chrono::DateTime::from_timestamp(h.date, 0) + .map(|dt| dt.format("%Y-%m-%d").to_string()) + .unwrap_or_else(|| h.date.to_string()); + let direction: &str = if h.type_ == 2 { + &user_name + } else { + &h.contact_name + }; + let score = h + .similarity_score + .map(|s| format!(" [score {:.2}]", s)) + .unwrap_or_default(); + out.push_str(&format!( + "[{}]{} {} — {}\n\n", + date, score, direction, h.body + )); + } + out } /// Tool: get_sms_messages — fetch SMS messages near a date for a contact @@ -1812,7 +1918,8 @@ Return ONLY the summary, nothing else."#, let days_radius = args .get("days_radius") .and_then(|v| v.as_i64()) - .unwrap_or(4); + .unwrap_or(4) + .clamp(1, 30); let limit = args .get("limit") .and_then(|v| v.as_i64()) @@ -1835,7 +1942,7 @@ Return ONLY the summary, nothing else."#, match self .sms_client - .fetch_messages_for_contact(contact.as_deref(), timestamp) + .fetch_messages_for_contact(contact.as_deref(), timestamp, days_radius) .await { Ok(messages) if !messages.is_empty() => { @@ -2054,6 +2161,82 @@ Return ONLY the summary, nothing else."#, } } + /// Tool: get_faces_in_photo — list face detections + person names for + /// the given file path. Resolves rel_path → content_hash via FaceDao, + /// then queries face_detections joined with persons (status='detected' + /// only). Returns a compact bullet list keyed for human-LLM readability. + async fn tool_get_faces_in_photo( + &self, + args: &serde_json::Value, + cx: &opentelemetry::Context, + ) -> String { + let file_path = match args.get("file_path").and_then(|v| v.as_str()) { + Some(p) if !p.trim().is_empty() => p.trim().to_string(), + _ => return "Error: missing required parameter 'file_path'".to_string(), + }; + log::info!("tool_get_faces_in_photo: file_path='{}'", file_path); + + // Resolve content_hash from any library that has this rel_path. + // Hold the FaceDao lock once across all libraries — resolve_content_hash + // is synchronous and there's no await in the loop body. + let content_hash: Option = { + let mut dao = self.face_dao.lock().expect("Unable to lock FaceDao"); + self.libraries.iter().find_map(|lib| { + dao.resolve_content_hash(cx, lib.id, &file_path) + .ok() + .flatten() + }) + }; + let Some(content_hash) = content_hash else { + return "No content_hash found for that file path (the photo may not be indexed yet, \ + or the path doesn't match any library)." + .to_string(); + }; + + let faces = { + let mut dao = self.face_dao.lock().expect("Unable to lock FaceDao"); + match dao.list_for_content_hash(cx, &content_hash) { + Ok(rows) => rows, + Err(e) => return format!("Error querying faces: {}", e), + } + }; + + if faces.is_empty() { + return "No faces detected in this photo.".to_string(); + } + + // Render: bound faces grouped by person first, then unbound. The + // model uses the bound names directly; the unbound count + bbox + // helps it count people without naming them. + let bound: Vec<&_> = faces.iter().filter(|f| f.person_name.is_some()).collect(); + let unbound: Vec<&_> = faces.iter().filter(|f| f.person_name.is_none()).collect(); + + let mut out = format!("Found {} face(s) in this photo:\n", faces.len()); + for f in &bound { + // Invariant: `bound` is filtered on `person_name.is_some()` above. + let name = f + .person_name + .as_deref() + .expect("bound face must have a name"); + out.push_str(&format!( + "- {} (confidence {:.2}, bbox x={:.2} y={:.2} w={:.2} h={:.2}, source: {})\n", + name, f.confidence, f.bbox_x, f.bbox_y, f.bbox_w, f.bbox_h, f.source, + )); + } + for f in &unbound { + out.push_str(&format!( + "- (unidentified) confidence {:.2}, bbox x={:.2} y={:.2} w={:.2} h={:.2}, source: {}\n", + f.confidence, + f.bbox_x, + f.bbox_y, + f.bbox_w, + f.bbox_h, + f.source, + )); + } + out + } + /// Tool: describe_photo — generate a visual description of the photo async fn tool_describe_photo( &self, @@ -2484,283 +2667,257 @@ Return ONLY the summary, nothing else."#, // ── Agentic insight generation ────────────────────────────────────── - /// Build the list of tool definitions for the agentic loop - pub(crate) fn build_tool_definitions(has_vision: bool, apollo_enabled: bool) -> Vec { - let mut tools = vec![ - Tool::function( + /// Build the list of tool definitions for the agentic loop, gated by + /// `opts`. Always-on tools: `search_messages`, `get_sms_messages`, + /// `get_file_tags`, `reverse_geocode`, `get_current_datetime`, the + /// four knowledge-memory tools. Conditional: `describe_photo` (vision + /// model), `get_personal_place_at` (Apollo configured), `search_rag` + /// (daily_summaries populated), `get_calendar_events` (calendar + /// populated), `get_location_history` (location history populated). + pub(crate) fn build_tool_definitions(opts: ToolGateOpts) -> Vec { + let mut tools: Vec = Vec::new(); + + if opts.daily_summaries_present { + tools.push(Tool::function( "search_rag", - "Search conversation history using semantic search. Use this to find relevant past conversations about specific topics, people, or events.", + "Date-anchored semantic search over the user's daily-summary corpus. \ + Returns up to `limit` summaries most semantically similar to `query`, \ + weighted toward summaries near `date`. For raw message text across all \ + time, prefer `search_messages`. \ + Examples: `{query: \"family dinner\", date: \"2018-12-24\"}` — what \ + daily summaries near Christmas Eve mention family / dinner / gathering. \ + `{query: \"work travel\", date: \"2019-06-15\", contact: \"Alice\"}` — \ + narrowed to summaries that involve Alice.", serde_json::json!({ "type": "object", "required": ["query", "date"], "properties": { - "query": { - "type": "string", - "description": "The search query to find relevant conversations" - }, - "date": { - "type": "string", - "description": "The reference date in YYYY-MM-DD format" - }, - "contact": { - "type": "string", - "description": "Optional contact name to filter results" - }, - "limit": { - "type": "integer", - "description": "Maximum number of results to return (default: 10, max: 25)" - } - } - }), - ), - Tool::function( - "search_messages", - "CONTENT search over SMS message bodies by keywords/phrases/topics across all time. Use when you're looking for specific wording (phrases, proper nouns, URLs, topics) and DON'T have a date in mind. NOT for time-based queries — if you know the date or want messages around a date, call get_sms_messages instead. Modes: 'fts5' (keyword, supports \"phrase\" / prefix* / AND / NEAR(w1 w2, 5)), 'semantic' (embedding similarity), 'hybrid' (recommended — merges both via reciprocal rank fusion).", - serde_json::json!({ - "type": "object", - "required": ["query"], - "properties": { - "query": { - "type": "string", - "description": "Search query. Min 3 chars. For fts5 mode, supports phrase (\"\"), prefix (*), AND/OR/NOT, and NEAR proximity." - }, - "mode": { - "type": "string", - "enum": ["fts5", "semantic", "hybrid"], - "description": "Search strategy. Default: hybrid." - }, - "limit": { - "type": "integer", - "description": "Maximum number of results (default: 20, max: 50)" - } - } - }), - ), - Tool::function( - "get_sms_messages", - "TIME-BASED fetch of SMS/text messages around a specific date (and optionally from a specific contact). Returns the actual message conversation for that window. Use this whenever you know the date or want the context around a photo's timestamp. Omit contact to search across all conversations. For keyword/topic search without a date, use search_messages instead.", - serde_json::json!({ - "type": "object", - "required": ["date"], - "properties": { - "date": { - "type": "string", - "description": "The center date in YYYY-MM-DD format" - }, - "contact": { - "type": "string", - "description": "Optional contact name to filter messages. If omitted, searches all conversations." - }, - "days_radius": { - "type": "integer", - "description": "Number of days before and after the date to search (default: 4)" - }, - "limit": { - "type": "integer", - "description": "Maximum number of messages to return (default: 60, max: 150)" - } - } - }), - ), - Tool::function( - "get_calendar_events", - "Fetch calendar events near a specific date. Shows scheduled events, meetings, and activities.", - serde_json::json!({ - "type": "object", - "required": ["date"], - "properties": { - "date": { - "type": "string", - "description": "The center date in YYYY-MM-DD format" - }, - "days_radius": { - "type": "integer", - "description": "Number of days before and after the date to search (default: 7)" - }, - "limit": { - "type": "integer", - "description": "Maximum number of events to return (default: 20, max: 50)" - } - } - }), - ), - Tool::function( - "get_location_history", - "Fetch location history records near a specific date. Shows places visited and activities.", - serde_json::json!({ - "type": "object", - "required": ["date"], - "properties": { - "date": { - "type": "string", - "description": "The center date in YYYY-MM-DD format" - }, - "days_radius": { - "type": "integer", - "description": "Number of days before and after the date to search (default: 14)" - } - } - }), - ), - Tool::function( - "get_file_tags", - "Get tags/labels that have been applied to a specific photo file.", - serde_json::json!({ - "type": "object", - "required": ["file_path"], - "properties": { - "file_path": { - "type": "string", - "description": "The file path of the photo to get tags for" - } - } - }), - ), - ]; - - tools.push(Tool::function( - "reverse_geocode", - "Convert GPS latitude/longitude coordinates to a human-readable place name (city, state). Use this when GPS coordinates are available in the photo metadata, or to resolve coordinates returned by get_location_history.", - serde_json::json!({ - "type": "object", - "required": ["latitude", "longitude"], - "properties": { - "latitude": { - "type": "number", - "description": "GPS latitude in decimal degrees" - }, - "longitude": { - "type": "number", - "description": "GPS longitude in decimal degrees" - } - } - }), - )); - - // Personal place lookup. Only registered when the integration is - // enabled — otherwise the LLM gets a tool that always errors. - if apollo_enabled { - tools.push(Tool::function( - "get_personal_place_at", - "Get the user's personal, named place (e.g. Home, Work, Cabin) at a GPS coordinate, if any. Returns the place name, category, free-text description (the user's own notes about the location), and radius. More specific than reverse_geocode — prefer this when both apply.", - serde_json::json!({ - "type": "object", - "required": ["latitude", "longitude"], - "properties": { - "latitude": { "type": "number", "description": "GPS latitude in decimal degrees" }, - "longitude": { "type": "number", "description": "GPS longitude in decimal degrees" } + "query": { "type": "string", "description": "Free-text query, semantically matched." }, + "date": { "type": "string", "description": "Anchor date, YYYY-MM-DD. Summaries near this date rank higher." }, + "contact": { "type": "string", "description": "Optional contact name to bias toward conversations with that person." }, + "limit": { "type": "integer", "description": "Max summaries to return (default 10, max 25)." } + } + }), + )); + } + + tools.push(Tool::function( + "search_messages", + "Search SMS/MMS message bodies. Modes: `fts5` (keyword + phrase + prefix + AND/OR/NOT + NEAR proximity), \ + `semantic` (embedding similarity, requires generated embeddings), `hybrid` (RRF merge, recommended; \ + degrades to fts5 when embeddings absent). Optional `start_ts` / `end_ts` (real-UTC unix seconds) and \ + `contact_id` filters. For pure date / contact browsing without keywords, prefer `get_sms_messages`. \ + Examples: `{query: \"trader joe's\"}` — phrase across all time. \ + `{query: \"dinner\", contact_id: 42, start_ts: 1700000000, end_ts: 1700604800}` — keyword within a contact and a week. \ + `{query: \"NEAR(meeting work, 5)\"}` — proximity search.", + serde_json::json!({ + "type": "object", + "required": ["query"], + "properties": { + "query": { "type": "string", "description": "Search query. Min 3 chars. fts5 supports phrase (\"\"), prefix (*), AND/OR/NOT, and NEAR proximity." }, + "mode": { "type": "string", "enum": ["fts5", "semantic", "hybrid"], "description": "Search strategy. Default: hybrid." }, + "limit": { "type": "integer", "description": "Max results (default 20, max 50)." }, + "contact_id": { "type": "integer", "description": "Optional numeric contact id to scope the search." }, + "start_ts": { "type": "integer", "description": "Optional inclusive lower bound, real-UTC unix seconds." }, + "end_ts": { "type": "integer", "description": "Optional inclusive upper bound, real-UTC unix seconds." } + } + }), + )); + + tools.push(Tool::function( + "get_sms_messages", + "Fetch SMS/MMS messages near a date (and optionally from a specific contact). Use when you know the date \ + or want context around a photo's timestamp. For keyword search without a date, use `search_messages`. \ + Returns up to `limit` messages within `±days_radius` of `date`, sorted by proximity. \ + Example: `{date: \"2018-08-12\", contact: \"Mom\", days_radius: 2}` — messages from Mom within ±2 days of Aug 12 2018.", + serde_json::json!({ + "type": "object", + "required": ["date"], + "properties": { + "date": { "type": "string", "description": "Center date, YYYY-MM-DD." }, + "contact": { "type": "string", "description": "Optional contact name (case-insensitive). Falls back to all contacts on no match." }, + "days_radius": { "type": "integer", "description": "Days before and after to include (default 4)." }, + "limit": { "type": "integer", "description": "Max messages to return (default 60, max 150)." } + } + }), + )); + + if opts.calendar_present { + tools.push(Tool::function( + "get_calendar_events", + "Fetch calendar events near a date — meetings, scheduled activities, all-day events. \ + Returns events within `±days_radius` of `date`. \ + Example: `{date: \"2019-03-22\", days_radius: 3}` — events within a week of March 22 2019.", + serde_json::json!({ + "type": "object", + "required": ["date"], + "properties": { + "date": { "type": "string", "description": "Center date, YYYY-MM-DD." }, + "days_radius": { "type": "integer", "description": "Days before and after to include (default 7)." }, + "limit": { "type": "integer", "description": "Max events to return (default 20, max 50)." } + } + }), + )); + } + + if opts.location_history_present { + tools.push(Tool::function( + "get_location_history", + "Fetch raw location records (lat/lon/timestamp/activity) near a date. The default 14-day radius is \ + wide because location density varies; tighten to ±1 day for a single-trip query. For a coordinate's \ + named place, use `reverse_geocode` (or `get_personal_place_at` when Apollo is enabled).", + serde_json::json!({ + "type": "object", + "required": ["date"], + "properties": { + "date": { "type": "string", "description": "Center date, YYYY-MM-DD." }, + "days_radius": { "type": "integer", "description": "Days before and after to include (default 14)." } + } + }), + )); + } + + tools.push(Tool::function( + "get_file_tags", + "Get user-applied tags for a specific photo file path. Tags are user-curated, not auto-detected.", + serde_json::json!({ + "type": "object", + "required": ["file_path"], + "properties": { + "file_path": { "type": "string", "description": "File path of the photo." } + } + }), + )); + + tools.push(Tool::function( + "reverse_geocode", + "Convert GPS lat/lon to a human-readable place name (city, state). Use for any coordinate the LLM has \ + obtained from EXIF or `get_location_history`. When Apollo is configured, prefer `get_personal_place_at` \ + — it returns the user's named places (Home / Work / etc.) which are more specific.", + serde_json::json!({ + "type": "object", + "required": ["latitude", "longitude"], + "properties": { + "latitude": { "type": "number", "description": "Decimal degrees." }, + "longitude": { "type": "number", "description": "Decimal degrees." } + } + }), + )); + + if opts.apollo_enabled { + tools.push(Tool::function( + "get_personal_place_at", + "Return any of the user's named Places (e.g. Home, Work, Cabin) whose radius contains (latitude, longitude). \ + Smallest radius first — most specific match wins. More specific than `reverse_geocode`; prefer this when \ + both apply. Returns place name, category, free-text description, and radius.", + serde_json::json!({ + "type": "object", + "required": ["latitude", "longitude"], + "properties": { + "latitude": { "type": "number", "description": "Decimal degrees." }, + "longitude": { "type": "number", "description": "Decimal degrees." } + } + }), + )); + } + + if opts.faces_present { + tools.push(Tool::function( + "get_faces_in_photo", + "Return the faces detected in this photo with their bounding boxes and assigned person names \ + (when bound). Each face carries `person_name` (string or null), `bbox` ({x, y, w, h} normalized 0–1), \ + `confidence` (0–1), and `source` ('auto' from detector or 'manual' from a user-drawn bbox). \ + More authoritative than `get_file_tags` for counting people in a photo or naming who is present, \ + since it returns detected-but-unbound faces too. \ + Example: `{file_path: \"2019/06/IMG_4242.jpg\"}`.", + serde_json::json!({ + "type": "object", + "required": ["file_path"], + "properties": { + "file_path": { "type": "string", "description": "File path of the photo." } } }), )); } - // Knowledge memory tools tools.push(Tool::function( "recall_entities", - "Search the knowledge memory for people, places, events, or things previously learned from other photos. Use this to retrieve context about subjects appearing in this photo.", + "Search the persistent knowledge memory for previously learned people, places, events, or things. \ + Use BEFORE writing the insight to ground the model on what's already known.", serde_json::json!({ "type": "object", "properties": { - "name": { - "type": "string", - "description": "Name or partial name to search for (case-insensitive substring match)" - }, - "entity_type": { - "type": "string", - "enum": ["person", "place", "event", "thing"], - "description": "Filter by entity type (optional)" - }, - "limit": { - "type": "integer", - "description": "Maximum number of results to return (default: 20, max: 50)" - } + "name": { "type": "string", "description": "Name or partial name (case-insensitive substring match)." }, + "entity_type": { "type": "string", "enum": ["person", "place", "event", "thing"] }, + "limit": { "type": "integer", "description": "Max results (default 20, max 50)." } } }), )); tools.push(Tool::function( "recall_facts_for_photo", - "Retrieve all known facts linked to a specific photo from the knowledge memory. Use this at the start of insight generation to load any previously stored knowledge about subjects in this photo.", + "Retrieve all stored facts linked to a specific photo. Call at the start of insight generation to load \ + prior knowledge about subjects in this photo without scanning the whole knowledge base.", serde_json::json!({ "type": "object", "required": ["file_path"], "properties": { - "file_path": { - "type": "string", - "description": "The file path of the photo to retrieve facts for" - } + "file_path": { "type": "string", "description": "File path of the photo." } } }), )); tools.push(Tool::function( "store_entity", - "Store or update a person, place, event, or thing in the knowledge memory. Call this when you identify a subject in this photo that should be remembered for future insights.", + "Upsert a person / place / event / thing into the knowledge memory. Returns the entity id (use it as \ + `subject_entity_id` or `object_entity_id` in `store_fact`). Idempotent on canonical name.", serde_json::json!({ "type": "object", "required": ["name", "entity_type"], "properties": { - "name": { - "type": "string", - "description": "The canonical name of the entity (e.g. 'John Smith', 'Banff National Park')" - }, - "entity_type": { - "type": "string", - "enum": ["person", "place", "event", "thing"], - "description": "The type of entity" - }, - "description": { - "type": "string", - "description": "A brief description of the entity" - } + "name": { "type": "string", "description": "Canonical name (e.g. \"John Smith\", \"Banff National Park\")." }, + "entity_type": { "type": "string", "enum": ["person", "place", "event", "thing"] }, + "description": { "type": "string", "description": "Brief description." } } }), )); tools.push(Tool::function( "store_fact", - "Record a fact about an entity in the knowledge memory. Provide EITHER object_entity_id (when the object is a known entity whose ID you have) OR object_value (for free-text attributes). The fact will be linked to the current photo automatically.", + "Record a fact about an entity in the knowledge memory. Always linked to the current photo. \ + You must provide EITHER `object_entity_id` (when the object is itself a stored entity — e.g. \ + person A is_friend_of person B) OR `object_value` (free-text attribute — e.g. role=\"software engineer\"). \ + `object_entity_id` takes precedence when both are present. \ + Examples: \ + `{subject_entity_id: 7, predicate: \"is_friend_of\", object_entity_id: 12}` — links two known entities. \ + `{subject_entity_id: 7, predicate: \"lives_in\", object_value: \"Portland, Oregon\"}` — free-text attribute.", serde_json::json!({ "type": "object", "required": ["subject_entity_id", "predicate"], "properties": { - "subject_entity_id": { - "type": "integer", - "description": "The ID of the entity this fact is about (returned by store_entity or recall_entities)" - }, - "predicate": { - "type": "string", - "description": "The relationship or attribute (e.g. 'is_friend_of', 'located_in', 'attended_event', 'is_sibling_of')" - }, - "object_entity_id": { - "type": "integer", - "description": "Use when the object is a known entity (e.g. another person's entity ID for 'is_friend_of '). Takes precedence over object_value." - }, - "object_value": { - "type": "string", - "description": "Use for free-text attributes where the object is not a stored entity (e.g. 'Portland, Oregon', 'software engineer')" - }, - "photo_role": { - "type": "string", - "description": "How this entity appears in the photo (e.g. 'subject', 'background', 'location'). Defaults to 'subject'." - } + "subject_entity_id": { "type": "integer", "description": "Entity id this fact is about." }, + "predicate": { "type": "string", "description": "Relationship or attribute (e.g. is_friend_of, located_in, attended_event)." }, + "object_entity_id": { "type": "integer", "description": "Use when the object is itself a stored entity. Takes precedence over object_value." }, + "object_value": { "type": "string", "description": "Use for free-text attributes where the object is not a stored entity." }, + "photo_role": { "type": "string", "description": "How this entity appears in the photo (default \"subject\")." } } }), )); tools.push(Tool::function( "get_current_datetime", - "Get the current date and time. Useful for understanding how long ago the photo was taken.", + "Get the current date and time. Useful when reasoning about how long ago a photo was taken.", serde_json::json!({ "type": "object", "properties": {} }), )); - if has_vision { + if opts.has_vision { tools.push(Tool::function( "describe_photo", - "Generate a visual description of the photo. Describes people, location, and activity visible in the image.", + "Generate a visual description of the current photo — people, location, objects, activity visible \ + in the image. Only available with vision-capable models.", serde_json::json!({ "type": "object", "properties": {} @@ -2943,6 +3100,57 @@ Return ONLY the summary, nothing else."#, } } + /// Assemble the chat system prompt from two named blocks: + /// + /// 1. **Identity / voice / format** — `custom_system_prompt` verbatim + /// when supplied, or a neutral default that doesn't fight a future + /// persona. The framework never asserts an identity that could + /// contradict the persona. + /// 2. **Procedural scaffolding** — tool-use guidance, iteration budget, + /// contact-filter rule. Identity-free; never asserts voice or shape. + /// + /// `owner_id_note` and `fewshot_block` are pre-rendered strings (they + /// already encode their own headers / blank lines). Pass empty / None + /// to skip. + pub(crate) fn build_system_content( + custom_system_prompt: Option<&str>, + owner_id_note: Option<&str>, + fewshot_block: &str, + max_iterations: usize, + ) -> String { + let identity = match custom_system_prompt { + Some(s) if !s.trim().is_empty() => s.trim().to_string(), + _ => String::from( + "You are reconstructing a memory from a photo. Use the gathered \ + context to write a thoughtful summary; you decide voice, length, and shape.", + ), + }; + + let owner = owner_id_note.unwrap_or(""); + + let procedural = format!( + "Tool-use guidance:\n\ + - You have a budget of {max_iterations} tool-calling iterations.\n\ + - Call tools to gather context BEFORE writing your final answer; don't answer after one or two calls.\n\ + - When calling get_sms_messages or search_rag, make at least one call WITHOUT a contact filter \ + — surrounding events matter even when a contact is known.\n\ + - Use recall_facts_for_photo + recall_entities to load any prior knowledge about subjects in the photo.\n\ + - When you identify people / places / events / things, use store_entity + store_fact to grow the persistent memory.\n\ + - A tool returning no results is informative; continue with the others.", + ); + + let mut out = identity; + if !owner.is_empty() { + out.push_str(owner); + } + out.push_str("\n\n"); + if !fewshot_block.is_empty() { + out.push_str(fewshot_block); + } + out.push_str(&procedural); + out + } + pub async fn generate_agentic_insight_for_photo( &self, file_path: &str, @@ -3288,9 +3496,12 @@ Return ONLY the summary, nothing else."#, None }; - // 8. Build system message - let owner_id_note = match owner_entity_id { - Some(id) => format!( + // 8. Build system message via the two-block helper. Custom prompt + // (when supplied) is the authoritative identity — the framework + // never appends a competing "you are a personal photo memory + // assistant" line. The procedural block stays identity-free. + let owner_id_note = owner_entity_id.map(|id| { + format!( "\n\nYour identity in the knowledge store: {name} (entity ID: {id}). \ When storing facts where you ({name}) are the object — for example, someone is your friend, \ sibling, or colleague — use subject_entity_id for the other person and set object_value to \ @@ -3298,32 +3509,15 @@ Return ONLY the summary, nothing else."#, {name} directly, use {id} as the subject_entity_id.", name = owner_name, id = id - ), - None => String::new(), - }; + ) + }); let fewshot_block = Self::render_fewshot_examples(&fewshot_examples); - let base_system = format!( - "You are a personal photo memory assistant helping to reconstruct a memory from a photo.{owner_id_note}\n\n\ - {fewshot_block}\ - IMPORTANT INSTRUCTIONS:\n\ - 1. You MUST call multiple tools to gather context BEFORE writing any final insight. Do not produce a final answer after only one or two tool calls.\n\ - 2. When calling get_sms_messages and search_rag, always make at least one call WITHOUT a contact filter to capture what else was happening in {owner_name}'s life around this date — other conversations, events, and activities provide important wider context even when a specific contact is known.\n\ - 3. Use recall_facts_for_photo to load any previously stored knowledge about subjects in this photo.\n\ - 4. Use recall_entities to look up known people, places, or things that appear in this photo.\n\ - 5. When you identify people, places, events, or notable things in this photo: use store_entity to record them and store_fact to record key facts (relationships, roles, attributes). This builds a persistent memory for future insights.\n\ - 6. Only produce your final insight AFTER you have gathered context from at least 5 tool calls.\n\ - 7. If a tool returns no results, that is useful information — continue calling the remaining tools anyway.\n\ - 8. You have a hard budget of {max_iterations} tool-calling iterations before the loop ends. Plan your context gathering so you can write a complete final insight within that budget.", - owner_id_note = owner_id_note, - fewshot_block = fewshot_block, - owner_name = owner_name, - max_iterations = max_iterations + let system_content = Self::build_system_content( + custom_system_prompt.as_deref(), + owner_id_note.as_deref(), + &fewshot_block, + max_iterations, ); - let system_content = if let Some(ref custom) = custom_system_prompt { - format!("{}\n\n{}", custom, base_system) - } else { - base_system.to_string() - }; // 9. Build user message let gps_info = exif @@ -3353,28 +3547,26 @@ Return ONLY the summary, nothing else."#, .map(|d| format!("Visual description (from local vision model):\n{}\n\n", d)) .unwrap_or_default(); + // Context-only payload — no output-shape prescription. The persona / + // custom_system_prompt owns voice, length, and structure. The "title + // and summary" claim that used to live here was unused (the title is + // regenerated post-hoc from the summary by generate_photo_title). let user_content = format!( - "{visual_block}Please analyze this photo and gather any relevant context from the surrounding weeks.\n\n\ - Photo file path: {}\n\ - Date taken: {}\n\ - {}\n\ - {}\n\ - {}\n\n\ - Use the available tools to gather more context about this moment (messages, calendar events, location history, etc.), \ - then write a detailed insight with a title and summary.", - file_path, - date_taken.format("%B %d, %Y"), - contact_info, - gps_info, - tags_info, - visual_block = visual_block, + "{visual_block}Photo file path: {file_path}\n\ + Date taken: {date}\n\ + {contact_info}\n\ + {gps_info}\n\ + {tags_info}\n\n\ + Gather context with the available tools, then respond.", + date = date_taken.format("%B %d, %Y"), ); - // 10. Define tools. Hybrid mode omits `describe_photo` since the - // chat model receives the visual description inline. - let offer_describe_tool = has_vision && !is_hybrid; - let tools = - Self::build_tool_definitions(offer_describe_tool, self.apollo_client.is_enabled()); + // 10. Define tools. Gate flags computed from current data presence; + // hybrid mode omits describe_photo since the chat model receives + // the visual description inline (so we pass `false` for has_vision + // in hybrid mode regardless of the model's actual capability). + let gate_opts = self.current_gate_opts(has_vision && !is_hybrid); + let tools = Self::build_tool_definitions(gate_opts); // 11. Build initial messages. In hybrid mode images are never // attached to the wire message — the description is part of @@ -3655,6 +3847,59 @@ mod tests { use super::*; use crate::ai::ollama::{ToolCall, ToolCallFunction}; + #[test] + fn build_tool_definitions_drops_gated_tools() { + let opts = ToolGateOpts { + has_vision: false, + apollo_enabled: false, + daily_summaries_present: false, + calendar_present: false, + location_history_present: false, + faces_present: false, + }; + let tools = InsightGenerator::build_tool_definitions(opts); + let names: Vec<&str> = tools.iter().map(|t| t.function.name.as_str()).collect(); + + // Always-on tools survive. + assert!(names.contains(&"search_messages")); + assert!(names.contains(&"get_sms_messages")); + assert!(names.contains(&"get_file_tags")); + assert!(names.contains(&"reverse_geocode")); + assert!(names.contains(&"get_current_datetime")); + assert!(names.contains(&"recall_entities")); + assert!(names.contains(&"recall_facts_for_photo")); + assert!(names.contains(&"store_entity")); + assert!(names.contains(&"store_fact")); + + // Gated tools are absent. + assert!(!names.contains(&"describe_photo")); + assert!(!names.contains(&"get_personal_place_at")); + assert!(!names.contains(&"search_rag")); + assert!(!names.contains(&"get_calendar_events")); + assert!(!names.contains(&"get_location_history")); + assert!(!names.contains(&"get_faces_in_photo")); + } + + #[test] + fn build_tool_definitions_includes_gated_tools_when_present() { + let opts = ToolGateOpts { + has_vision: true, + apollo_enabled: true, + daily_summaries_present: true, + calendar_present: true, + location_history_present: true, + faces_present: true, + }; + let tools = InsightGenerator::build_tool_definitions(opts); + let names: Vec<&str> = tools.iter().map(|t| t.function.name.as_str()).collect(); + assert!(names.contains(&"describe_photo")); + assert!(names.contains(&"get_personal_place_at")); + assert!(names.contains(&"search_rag")); + assert!(names.contains(&"get_calendar_events")); + assert!(names.contains(&"get_location_history")); + assert!(names.contains(&"get_faces_in_photo")); + } + fn place(name: &str, description: &str) -> ApolloPlace { ApolloPlace { id: 1, @@ -3923,6 +4168,44 @@ mod tests { assert_eq!(out, "11 chars"); } + #[test] + fn build_system_content_uses_custom_prompt_verbatim_for_identity() { + let out = InsightGenerator::build_system_content( + Some("You are a journal writer in first person, warm and reflective."), + None, + "", + 6, + ); + assert!( + out.starts_with("You are a journal writer in first person, warm and reflective."), + "custom prompt must lead the system content; got: {}", + &out[..out.len().min(200)], + ); + assert!( + !out.contains("personal photo memory assistant"), + "framework identity must not leak when custom prompt is supplied" + ); + assert!(out.contains("Tool-use guidance")); + assert!(out.contains("budget of 6")); + } + + #[test] + fn build_system_content_uses_neutral_default_when_no_custom() { + let out = InsightGenerator::build_system_content(None, None, "", 6); + assert!(out.contains("reconstructing a memory from a photo")); + assert!(!out.contains("personal photo memory assistant")); + assert!(out.contains("Tool-use guidance")); + } + + #[test] + fn build_system_content_includes_fewshot_and_owner_id() { + let owner = "\n\nYour identity in the knowledge store: Alice (entity ID: 7)."; + let fewshot = "## Examples\n\n### Example 1\n...\n\n---\n\n"; + let out = InsightGenerator::build_system_content(None, Some(owner), fewshot, 6); + assert!(out.contains("Alice (entity ID: 7)")); + assert!(out.contains("## Examples")); + } + #[test] fn render_fewshot_empty_returns_empty_string() { assert!(InsightGenerator::render_fewshot_examples(&[]).is_empty()); diff --git a/src/ai/sms_client.rs b/src/ai/sms_client.rs index ad6d28e..b59c266 100644 --- a/src/ai/sms_client.rs +++ b/src/ai/sms_client.rs @@ -20,31 +20,36 @@ impl SmsApiClient { } } - /// Fetch messages for a specific contact within ±4 days of the given timestamp - /// Falls back to all contacts if no messages found for the specific contact - /// Messages are sorted by proximity to the center timestamp + /// Compute a `[start, end]` unix-second window of `2 * radius_days` + /// centered on `center_ts`. `radius_days < 1` is clamped to 1 to avoid + /// degenerate zero-width windows. + pub(crate) fn window_for_radius(center_ts: i64, radius_days: i64) -> (i64, i64) { + let r = radius_days.max(1); + let span = r * 86400; + (center_ts - span, center_ts + span) + } + + /// Fetch messages for a specific contact within ±`radius_days` of the + /// given timestamp. Falls back to all contacts when no messages found + /// for the named contact. Sorted by proximity to the center timestamp. pub async fn fetch_messages_for_contact( &self, contact: Option<&str>, center_timestamp: i64, + radius_days: i64, ) -> Result> { - use chrono::Duration; + let effective_radius = radius_days.max(1); + let (start_ts, end_ts) = Self::window_for_radius(center_timestamp, radius_days); - // Calculate ±4 days range around the center timestamp let center_dt = chrono::DateTime::from_timestamp(center_timestamp, 0) .ok_or_else(|| anyhow::anyhow!("Invalid timestamp"))?; - let start_dt = center_dt - Duration::days(4); - let end_dt = center_dt + Duration::days(4); - - let start_ts = start_dt.timestamp(); - let end_ts = end_dt.timestamp(); - // If contact specified, try fetching for that contact first if let Some(contact_name) = contact { log::info!( - "Fetching SMS for contact: {} (±4 days from {})", + "Fetching SMS for contact: {} (±{} days from {})", contact_name, + effective_radius, center_dt.format("%Y-%m-%d %H:%M:%S") ); let messages = self @@ -68,7 +73,8 @@ impl SmsApiClient { // Fallback to all contacts log::info!( - "Fetching all SMS messages (±4 days from {})", + "Fetching all SMS messages (±{} days from {})", + effective_radius, center_dt.format("%Y-%m-%d %H:%M:%S") ); self.fetch_messages(start_ts, end_ts, None, Some(center_timestamp)) @@ -255,19 +261,26 @@ impl SmsApiClient { /// - "fts5" keyword-only, supports phrase / prefix / boolean / NEAR /// - "semantic" embedding similarity /// - "hybrid" both merged via reciprocal rank fusion (recommended) - pub async fn search_messages( + /// + /// The SMS-API endpoint accepts `contact_id` natively; date filtering is + /// the caller's responsibility (post-filter on the returned rows). + pub async fn search_messages_with_contact( &self, query: &str, mode: &str, limit: usize, + contact_id: Option, ) -> Result> { - let url = format!( + let mut url = format!( "{}/api/messages/search/?q={}&mode={}&limit={}", self.base_url, urlencoding::encode(query), urlencoding::encode(mode), limit ); + if let Some(cid) = contact_id { + url.push_str(&format!("&contact_id={}", cid)); + } let mut request = self.client.get(&url); if let Some(token) = &self.token { @@ -379,3 +392,29 @@ struct SmsSearchResponse { #[serde(default)] search_method: String, } + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn window_for_radius_produces_2n_day_span() { + let center: i64 = 1_700_000_000; + let (start, end) = SmsApiClient::window_for_radius(center, 7); + assert_eq!(end - start, 14 * 86400); + assert_eq!(start + 7 * 86400, center); + assert_eq!(end - 7 * 86400, center); + } + + #[test] + fn window_for_radius_clamps_zero_to_one() { + let (start, end) = SmsApiClient::window_for_radius(100_000, 0); + assert_eq!(end - start, 2 * 86400); + } + + #[test] + fn window_for_radius_clamps_negative_to_one() { + let (start, end) = SmsApiClient::window_for_radius(100_000, -7); + assert_eq!(end - start, 2 * 86400); + } +} diff --git a/src/bin/populate_knowledge.rs b/src/bin/populate_knowledge.rs index 9c55e60..c2d1ade 100644 --- a/src/bin/populate_knowledge.rs +++ b/src/bin/populate_knowledge.rs @@ -14,6 +14,7 @@ use image_api::database::{ SqliteInsightDao, SqliteKnowledgeDao, SqliteLocationHistoryDao, SqliteSearchHistoryDao, connect, }; +use image_api::faces::{FaceDao, SqliteFaceDao}; use image_api::file_types::{IMAGE_EXTENSIONS, VIDEO_EXTENSIONS}; use image_api::libraries::{self, Library}; use image_api::tags::{SqliteTagDao, TagDao}; @@ -182,6 +183,8 @@ async fn main() -> anyhow::Result<()> { Arc::new(Mutex::new(Box::new(SqliteTagDao::default()))); let knowledge_dao: Arc>> = Arc::new(Mutex::new(Box::new(SqliteKnowledgeDao::new()))); + let face_dao: Arc>> = + Arc::new(Mutex::new(Box::new(SqliteFaceDao::new()))); // Pass the full library set so `resolve_full_path` probes every root, // even when --library restricts the walk. A rel_path shared across @@ -198,6 +201,7 @@ async fn main() -> anyhow::Result<()> { location_dao, search_dao, tag_dao, + face_dao, knowledge_dao, all_libs.clone(), ); diff --git a/src/database/daily_summary_dao.rs b/src/database/daily_summary_dao.rs index 276a5a2..ec2a161 100644 --- a/src/database/daily_summary_dao.rs +++ b/src/database/daily_summary_dao.rs @@ -75,6 +75,11 @@ pub trait DailySummaryDao: Sync + Send { context: &opentelemetry::Context, contact: &str, ) -> Result; + + /// Cheap presence check — returns true iff at least one daily summary row + /// exists. Used by gating logic that only needs "is the table empty?", + /// avoiding a `COUNT(*)` full scan on large corpora. + fn has_any_summaries(&mut self, context: &opentelemetry::Context) -> Result; } pub struct SqliteDailySummaryDao { @@ -454,6 +459,30 @@ impl DailySummaryDao for SqliteDailySummaryDao { }) .map_err(|_| DbError::new(DbErrorKind::QueryError)) } + + fn has_any_summaries(&mut self, context: &opentelemetry::Context) -> Result { + trace_db_call(context, "query", "has_any_summaries", |_span| { + let mut conn = self + .connection + .lock() + .expect("Unable to get DailySummaryDao"); + + #[derive(QueryableByName)] + struct ProbeResult { + #[diesel(sql_type = diesel::sql_types::Integer)] + #[allow(dead_code)] + one: i32, + } + + let rows: Vec = + diesel::sql_query("SELECT 1 as one FROM daily_conversation_summaries LIMIT 1") + .load(conn.deref_mut()) + .map_err(|e| anyhow::anyhow!("Failed to probe daily summaries: {}", e))?; + + Ok(!rows.is_empty()) + }) + .map_err(|_| DbError::new(DbErrorKind::QueryError)) + } } // Helper structs for raw SQL queries diff --git a/src/faces.rs b/src/faces.rs index fb2fb87..5b7a232 100644 --- a/src/faces.rs +++ b/src/faces.rs @@ -503,6 +503,10 @@ pub trait FaceDao: Send + Sync { into: i32, ) -> anyhow::Result; + /// Cheap presence probe — returns true iff at least one face has been + /// detected (excluding marker rows). Used by chat-tool gating. + fn has_any_faces(&mut self, ctx: &opentelemetry::Context) -> anyhow::Result; + /// Resolve `(library_id, rel_path)` → `content_hash` via image_exif. /// Returns None when the photo hasn't been EXIF-indexed yet (no row /// in image_exif) or when the row exists but content_hash is NULL. @@ -1432,6 +1436,19 @@ impl FaceDao for SqliteFaceDao { }) } + fn has_any_faces(&mut self, ctx: &opentelemetry::Context) -> anyhow::Result { + let mut conn = self.connection.lock().expect("face dao lock"); + trace_db_call(ctx, "query", "has_any_faces", |_span| { + face_detections::table + .filter(face_detections::status.eq("detected")) + .select(face_detections::id) + .first::(conn.deref_mut()) + .optional() + .map(|x| x.is_some()) + .with_context(|| "has_any_faces query") + }) + } + fn resolve_content_hash( &mut self, ctx: &opentelemetry::Context, diff --git a/src/files.rs b/src/files.rs index 7db246f..90d6eca 100644 --- a/src/files.rs +++ b/src/files.rs @@ -1718,7 +1718,12 @@ mod tests { // Mock — files.rs tests don't exercise the date-override endpoints. // Returning a synthetic row keeps the trait satisfied without // depending on private DbError constructors. - Ok(mock_exif_row(library_id, rel_path, Some(date_taken), Some("manual".to_string()))) + Ok(mock_exif_row( + library_id, + rel_path, + Some(date_taken), + Some("manual".to_string()), + )) } fn clear_manual_date_taken( diff --git a/src/main.rs b/src/main.rs index 68f8f61..f998160 100644 --- a/src/main.rs +++ b/src/main.rs @@ -995,10 +995,8 @@ async fn upload_image( } }; let perceptual = perceptual_hash::compute(&uploaded_path); - let resolved_date = date_resolver::resolve_date_taken( - &uploaded_path, - exif_data.date_taken, - ); + let resolved_date = + date_resolver::resolve_date_taken(&uploaded_path, exif_data.date_taken); let insert_exif = InsertImageExif { library_id: target_library.id, file_path: relative_path.clone(), @@ -1022,8 +1020,7 @@ async fn upload_image( size_bytes, phash_64: perceptual.map(|h| h.phash_64), dhash_64: perceptual.map(|h| h.dhash_64), - date_taken_source: resolved_date - .map(|r| r.source.as_str().to_string()), + date_taken_source: resolved_date.map(|r| r.source.as_str().to_string()), }; if let Ok(mut dao) = exif_dao.lock() { @@ -1687,7 +1684,16 @@ fn create_thumbnails(libs: &[libraries::Library], excluded_dirs: &[String]) { ]); debug!("Generating video thumbnail: {:?}", thumb_path); - generate_video_thumbnail(src, &thumb_path); + if let Err(e) = generate_video_thumbnail(src, &thumb_path) { + let sentinel = unsupported_thumbnail_sentinel(&thumb_path); + error!( + "Unable to thumbnail video {:?}: {}. Writing sentinel {:?}", + src, e, sentinel + ); + if let Err(se) = std::fs::write(&sentinel, b"") { + warn!("Failed to write sentinel {:?}: {}", sentinel, se); + } + } video_span.end(); } else if is_image(&entry) { match generate_image_thumbnail(src, &thumb_path) { diff --git a/src/memories.rs b/src/memories.rs index 0153d2b..d31cbb8 100644 --- a/src/memories.rs +++ b/src/memories.rs @@ -213,10 +213,7 @@ pub fn extract_date_from_filename(filename: &str) -> Option>> = Arc::new(Mutex::new(Box::new(SqliteKnowledgeDao::new()))); + let face_dao: Arc>> = + Arc::new(Mutex::new(Box::new(faces::SqliteFaceDao::new()))); // Load base path and ensure the primary library row reflects it. let base_path = env::var("BASE_PATH").expect("BASE_PATH was not set in the env"); @@ -231,6 +234,7 @@ impl Default for AppState { location_dao.clone(), search_dao.clone(), tag_dao.clone(), + face_dao.clone(), knowledge_dao, libraries_vec.clone(), ); @@ -348,6 +352,8 @@ impl AppState { Arc::new(Mutex::new(Box::new(SqliteTagDao::default()))); let knowledge_dao: Arc>> = Arc::new(Mutex::new(Box::new(SqliteKnowledgeDao::new()))); + let face_dao: Arc>> = + Arc::new(Mutex::new(Box::new(faces::SqliteFaceDao::new()))); // Initialize test InsightGenerator with all data sources let base_path_str = base_path.to_string_lossy().to_string(); @@ -370,6 +376,7 @@ impl AppState { location_dao.clone(), search_dao.clone(), tag_dao.clone(), + face_dao.clone(), knowledge_dao, vec![test_lib], ); diff --git a/src/video/actors.rs b/src/video/actors.rs index c7300ef..cfe2aa7 100644 --- a/src/video/actors.rs +++ b/src/video/actors.rs @@ -107,19 +107,39 @@ pub async fn create_playlist(video_path: &str, playlist_file: &str) -> Result std::io::Result<()> { + // -vf scale + -c:v mjpeg mirrors `generate_image_thumbnail_ffmpeg`. The + // filter chain matters as much as the scale does: without it, ffmpeg + // hands the decoded frame straight to the mjpeg encoder, which rejects + // any non-yuvj420p source ("Non full-range YUV is non-standard"). The + // filter chain lets ffmpeg auto-insert the pix_fmt converter the + // encoder needs, which is how the image-thumbnail path already handles + // the same class of source. + let output = Command::new("ffmpeg") + .arg("-y") .arg("-ss") .arg("3") .arg("-i") - .arg(path.to_str().unwrap()) + .arg(path) .arg("-vframes") .arg("1") + .arg("-vf") + .arg("scale=200:-1") .arg("-f") .arg("image2") + .arg("-c:v") + .arg("mjpeg") .arg(destination) - .output() - .expect("Failure to create video frame"); + .output()?; + + if !output.status.success() { + return Err(std::io::Error::other(format!( + "ffmpeg failed ({}): {}", + output.status, + String::from_utf8_lossy(&output.stderr).trim() + ))); + } + Ok(()) } /// Use ffmpeg to extract a 200px-wide thumbnail from formats the `image` crate