Compare commits

4 Commits

Author SHA1 Message Date
Cameron Cordes
2ff06413c6 ai: restructure agentic user message — facts up top + forcing gate
Small models (~8B) were producing generic responses regardless of
persona, and bailing out of the agentic loop on iteration 1. Two
underlying causes:

1. Photo facts (date, location, contact, tags, visual) were buried
   between "Please analyze this photo" preamble and "Use the available
   tools" outro. Small models skim and miss them, which is why outputs
   weren't anchoring to the actual photo.

2. The user message ended with "write a detailed insight" — small
   models took the path of least resistance and just wrote, ignoring
   the soft "aim to use 5 tools" floor in the system prompt.

Restructured the user message:

- Leads with a "## This photo" bulleted block so the metadata is
  visible top-down. File path, date+source, contact, location+GPS,
  tags, and (in hybrid) the visual description are all bullets the
  model can't skim past.
- Replaces the prose body with a numbered "## What to do" recipe:
  (1) recall_facts_for_photo + recall_entities, (2) ≥3 of the
  time-window tools, (3) write only after tool results, referencing
  specific facts. "Generic narration is not acceptable" is explicit.
- Ends with a hard forcing line: "YOUR FIRST RESPONSE MUST BE A TOOL
  CALL. Do not output any final answer text until you have called at
  least 5 tools." Replaces the soft "aim to" floor with a directive
  small models actually follow.

Tradeoff: big models also follow the recipe literally and may call
5 tools when 3 would do. Optimizing for the small-model floor first;
soften once that's working.
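
The restructured message layout described above can be sketched roughly as follows. This is an illustration only: the `PhotoFacts` struct, the `build_user_message` name, and the exact bullet labels are assumptions, not the code in this change. Only the section headings, the recipe steps, and the forcing line come from the commit message.

```rust
/// Hypothetical container for the photo metadata listed in the commit message.
struct PhotoFacts<'a> {
    file_path: &'a str,
    date_line: &'a str,        // date plus its source, already formatted
    contact: Option<&'a str>,
    location: Option<&'a str>, // location plus GPS, already formatted
    tags: &'a [&'a str],
    visual: Option<&'a str>,   // only present in hybrid mode
}

/// Build the user message: facts first, then the recipe, then the forcing line.
fn build_user_message(f: &PhotoFacts<'_>) -> String {
    let mut out = String::from("## This photo\n\n");
    out.push_str(&format!("- File: {}\n", f.file_path));
    out.push_str(&format!("- Date: {}\n", f.date_line));
    if let Some(contact) = f.contact {
        out.push_str(&format!("- Contact: {}\n", contact));
    }
    if let Some(location) = f.location {
        out.push_str(&format!("- Location: {}\n", location));
    }
    if !f.tags.is_empty() {
        out.push_str(&format!("- Tags: {}\n", f.tags.join(", ")));
    }
    if let Some(visual) = f.visual {
        out.push_str(&format!("- Visual: {}\n", visual));
    }

    out.push_str("\n## What to do\n\n");
    out.push_str("1. Call recall_facts_for_photo and recall_entities.\n");
    out.push_str("2. Call at least 3 of the time-window tools.\n");
    out.push_str(
        "3. Write only after tool results, referencing specific facts. \
         Generic narration is not acceptable.\n\n",
    );

    // The forcing line goes last so it is the final thing the model reads.
    out.push_str(
        "YOUR FIRST RESPONSE MUST BE A TOOL CALL. Do not output any final \
         answer text until you have called at least 5 tools.",
    );
    out
}
```

Optional fields are skipped rather than printed as empty bullets, so the facts block only contains lines the model can actually anchor on.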

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 10:59:39 -04:00
Cameron Cordes
66ea8490ab backfill_date_taken: surface the actual diesel error in warnings
The DAO swallowed every diesel::update failure as a flat
`anyhow!("Update error")`, then trace_db_call further reduced it to
`DbError { kind: UpdateError }`. Operators saw "update failed for lib
2 Snapchat/foo.mp4: DbError { kind: UpdateError }" with no clue why
(constraint violation? type mismatch? row vanished mid-flight? DB
locked?).

Three changes:
- Preserve the diesel error in the anyhow chain along with the input
  params (lib, rel_path, date_taken, source) so the cause is visible.
- Log the chain at warn-level inside the DAO before the trace wrapper
  collapses it to DbErrorKind::UpdateError, so the warning at the
  call site finally has something diagnosable next to it.
- Treat zero-row updates as a debug-level "row likely retired by the
  missing-file scan" rather than a hard failure — that case is benign
  and shouldn't poison the drain's error tally.
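
A minimal sketch of the error-handling shape described above, using only the standard library. The real code works with diesel and anyhow; `DriverError`, `UpdateFailure`, `UpdateOutcome`, `classify`, and `render_chain` are hypothetical names for illustration, and only two of the input params (lib and rel_path) are carried here.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the underlying database error (a diesel error in the real code).
#[derive(Debug)]
struct DriverError(String);

impl fmt::Display for DriverError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for DriverError {}

/// Update failure that keeps the input parameters and the original cause.
#[derive(Debug)]
struct UpdateFailure {
    library_id: i32,
    rel_path: String,
    cause: DriverError,
}

impl fmt::Display for UpdateFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "date_taken update failed for lib {} {}", self.library_id, self.rel_path)
    }
}

impl Error for UpdateFailure {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Outcome of one backfill update.
#[derive(Debug, PartialEq)]
enum UpdateOutcome {
    Updated,
    /// Zero rows matched: treated as benign, not as an error.
    RowGone,
}

/// Map the raw "rows affected" result onto the three cases from the commit message.
fn classify(
    rows: Result<usize, DriverError>,
    library_id: i32,
    rel_path: &str,
) -> Result<UpdateOutcome, UpdateFailure> {
    match rows {
        Ok(0) => Ok(UpdateOutcome::RowGone),
        Ok(_) => Ok(UpdateOutcome::Updated),
        Err(cause) => Err(UpdateFailure {
            library_id,
            rel_path: rel_path.to_string(),
            cause,
        }),
    }
}

/// Render an error and all of its sources on one line, outermost first.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(inner) = current {
        out.push_str(": ");
        out.push_str(&inner.to_string());
        current = inner.source();
    }
    out
}
```

Logging `render_chain(&err)` at warn level before any wrapper collapses the error would keep the cause next to the call-site warning, and the `RowGone` case stays out of the error tally.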

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 10:41:09 -04:00
Cameron Cordes
10ba706b39 ai: reframe iteration budget as capacity, not constraint
Small models (~8B) were bailing out of the agentic loop after one or two
tool calls under the previous "hard budget … stop when nearly exhausted"
phrasing. They read that as a conservation directive and the
"trivial photos may need fewer" clause gave them an easy out.

Flipped both the agentic and chat-turn prompts to frame the budget as
capacity to spend, with a soft floor (≥5 tool calls before writing) and
an explicit reserve clause for the final reply. Big models will still
deviate when warranted; small models follow the floor.
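
For the chat-turn side, the reframed budget text can be sketched like this. The format string is the one added in this change; the `Message` struct and the role check are simplifying assumptions, not the project's actual types.

```rust
/// Hypothetical, simplified chat message.
struct Message {
    role: String,
    content: String,
}

/// Append the budget section to the leading system message.
/// Returns the original system content so the caller can restore it later,
/// or `None` when there is no leading system message to annotate.
fn annotate_system_with_budget(messages: &mut [Message], max_iterations: usize) -> Option<String> {
    let first = messages.first_mut()?;
    if first.role != "system" {
        return None;
    }
    let original = first.content.clone();
    // Budget is phrased as capacity to spend, with the last iteration reserved
    // for the reply, and gets its own heading so it is not skimmed past.
    first.content = format!(
        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
        original, max_iterations
    );
    Some(original)
}
```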

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 10:36:05 -04:00
Cameron Cordes
9071d05932 ai: insight tools audit — bug fixes, new tools, prompt structure
Bug fixes:
- get_sms_messages.days_radius is now actually honored (was hardcoded
  to ±4d in SmsApiClient::fetch_messages_for_contact).
- describe_photo memoized for the lifetime of one agentic loop / one
  chat turn — re-running mid-loop produced conflicting visual
  descriptions in the transcript.
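
Both fixes can be illustrated with a small standard-library sketch. The function names are hypothetical. The real client builds its window with chrono and the real cache is a `tokio::sync::Mutex`; this version uses plain seconds and `std::sync::Mutex` to stay self-contained.

```rust
use std::sync::Mutex;

const SECS_PER_DAY: i64 = 86_400;

/// Compute the (start, end) Unix-second window for an SMS lookup.
/// `None` falls back to 4 days; the radius is clamped to 1..=30 days.
fn sms_window(center_timestamp: i64, days_radius: Option<i64>) -> (i64, i64) {
    let radius = days_radius.unwrap_or(4).clamp(1, 30);
    let span = radius * SECS_PER_DAY;
    (center_timestamp - span, center_timestamp + span)
}

/// Return the cached description if one exists; otherwise compute it once,
/// store it, and return it. One cache lives for one loop or one chat turn.
fn describe_once<F>(cache: &Mutex<Option<String>>, compute: F) -> String
where
    F: FnOnce() -> String,
{
    let mut slot = cache.lock().unwrap();
    if let Some(cached) = &*slot {
        return cached.clone();
    }
    let fresh = compute();
    *slot = Some(fresh.clone());
    fresh
}
```

With the cache in place, a second `describe_once` call in the same loop returns the first description verbatim, so the transcript never contains two competing accounts of the same image.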

Agentic user message:
- Pre-resolve location via Apollo + Nominatim and emit one Location:
  line instead of bare GPS, mirroring the non-agentic flow.
- Date now formats with weekday + canonical-date source so the model
  can hedge on fs_time-derived dates.
- Hybrid mode visual block tells the model not to call describe_photo
  (the tool is already gated off in hybrid).

System prompt structure:
- custom_system_prompt now appends under an explicit "User overrides
  (these take precedence)" heading instead of prepending — so a custom
  voice/POV/format prompt actually beats the built-in defaults.
- Numbered rules collapsed into bulleted "Tool-use guidance"; merged
  the contradictory "multiple tools BEFORE" / "after 5 calls" rules.
- Chat budget annotation surfaces as its own ## heading.
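
The append-instead-of-prepend ordering for `custom_system_prompt` might look like the sketch below. `assemble_system_prompt` is a hypothetical name, and the `##` heading level and the handling of blank custom prompts are assumptions; only the heading text comes from this commit.

```rust
/// Place the custom prompt after the built-in defaults, under an explicit
/// heading, so later instructions win over earlier ones.
fn assemble_system_prompt(base: &str, custom: Option<&str>) -> String {
    // Treat a missing or whitespace-only custom prompt as "no override".
    match custom.map(str::trim).filter(|c| !c.is_empty()) {
        Some(c) => format!(
            "{}\n\n## User overrides (these take precedence)\n\n{}",
            base, c
        ),
        None => base.to_string(),
    }
}
```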

New tools:
- recall_facts_for_entity(entity_id|name) — facts for one entity
  without needing a photo path. Fills the "tell me about Sarah" chat
  case where recall_facts_for_photo doesn't apply.
- find_photos_with_entity(entity_id|name) — "when did I last see X /
  show me photos from the Tahoe trip" via entity_photo_links.
- get_exif(file_path) — full EXIF row for any photo, for technical
  ("what camera was this on?") questions.

Tools removed:
- get_file_tags duplicated the inline Tags: line on the user message;
  exposing both gave models an excuse to "confirm" what they had.

Tool descriptions tightened:
- search_rag now correctly says "per-day, per-contact summaries" and
  explains the date is for time-decay weighting.
- recall_entities warns about empty-filter dumps.
- store_entity / store_fact document dedup return + snake_case
  predicate vocabulary.
- reverse_geocode defers to the pre-resolved location and to
  get_personal_place_at for personal places.
- get_current_datetime narrowed to time-since-photo use.

Calendar / location:
- get_calendar_events accepts query and embeds it for hybrid time +
  semantic ranking (was always passing None for the embedding).
- get_location_history exposes limit; description tells the model
  there's no semantic ranking on this surface.

New disable_writes flag:
- POST /insights/generate/agentic and the chat endpoints accept
  disable_writes: bool. When true, drops store_entity / store_fact
  from the tool palette and rewrites the system prompt's knowledge-
  write line. Lets users explore alternate prompts (caption-style,
  third-person, haiku) without polluting the persistent KB.
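
As a rough sketch of how the flag could gate the tool palette: the function below is a simplified stand-in, not the project's `build_tool_definitions` (which also takes other arguments and returns full tool definitions), and the list of tool names is a small illustrative subset.

```rust
/// Return the tool names offered to the model for one run.
/// When `disable_writes` is true, the two KB-writing tools are dropped.
fn build_tool_names(disable_writes: bool) -> Vec<&'static str> {
    let mut tools = vec![
        "recall_entities",
        "recall_facts_for_photo",
        "search_rag",
        "store_entity",
        "store_fact",
    ];
    if disable_writes {
        tools.retain(|name| !matches!(*name, "store_entity" | "store_fact"));
    }
    tools
}
```

Filtering the palette, rather than rejecting write calls at execution time, means the model never sees the write tools and cannot spend iterations on calls that would be refused.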

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 22:37:32 -04:00
7 changed files with 654 additions and 156 deletions

View File

@@ -48,6 +48,11 @@ pub struct GeneratePhotoInsightRequest {
/// falls back to `DEFAULT_FEWSHOT_INSIGHT_IDS`.
#[serde(default)]
pub fewshot_insight_ids: Option<Vec<i32>>,
/// When true, drop `store_entity` / `store_fact` from the tool palette
/// for this run. Use for one-off explorations (caption-style prompts,
/// experimentation) that shouldn't pollute the persistent KB.
#[serde(default)]
pub disable_writes: bool,
}
#[derive(Debug, Deserialize)]
@@ -390,6 +395,7 @@ pub async fn generate_agentic_insight_handler(
request.backend.clone(),
fewshot_examples,
fewshot_ids,
request.disable_writes,
)
.await;
@@ -642,6 +648,10 @@ pub struct ChatTurnHttpRequest {
pub max_iterations: Option<usize>,
#[serde(default)]
pub amend: bool,
/// Drop store_entity / store_fact from the tool palette for this turn —
/// useful for hypothetical/exploration chats that shouldn't pollute the KB.
#[serde(default)]
pub disable_writes: bool,
}
#[derive(Debug, Serialize)]
@@ -696,6 +706,7 @@ pub async fn chat_turn_handler(
min_p: request.min_p,
max_iterations: request.max_iterations,
amend: request.amend,
disable_writes: request.disable_writes,
};
match app_state.insight_chat.chat_turn(chat_req).await {
@@ -910,6 +921,7 @@ pub async fn chat_stream_handler(
min_p: request.min_p,
max_iterations: request.max_iterations,
amend: request.amend,
disable_writes: request.disable_writes,
};
let service = app_state.insight_chat.clone();

View File

@@ -48,6 +48,10 @@ pub struct ChatTurnRequest {
/// When true, write a new insight row (regenerating title) instead of
/// updating training_messages on the existing row.
pub amend: bool,
/// When true, drop `store_entity` / `store_fact` from the tool palette
/// for this turn. Use to explore alternate phrasings or run
/// hypothetical chats without polluting the persistent KB.
pub disable_writes: bool,
}
#[derive(Debug)]
@@ -362,6 +366,7 @@ impl InsightChatService {
let tools = InsightGenerator::build_tool_definitions(
offer_describe_tool,
self.generator.apollo_enabled(),
req.disable_writes,
);
// Image base64 only needed when describe_photo is on the menu. Load
@@ -397,6 +402,9 @@ impl InsightChatService {
// tighter and dispatching tools through the shared executor.
let loop_span = tracer.start_with_context("ai.chat.loop", &insight_cx);
let loop_cx = insight_cx.with_span(loop_span);
// Memoize describe_photo for this turn so repeated calls don't
// produce conflicting visual descriptions in the assistant transcript.
let describe_cache: tokio::sync::Mutex<Option<String>> = tokio::sync::Mutex::new(None);
let mut tool_calls_made = 0usize;
let mut iterations_used = 0usize;
let mut last_prompt_eval_count: Option<i32> = None;
@@ -445,6 +453,7 @@ impl InsightChatService {
&image_base64,
&normalized,
&loop_cx,
Some(&describe_cache),
)
.await;
messages.push(ChatMessage::tool_result(result));
@@ -793,6 +802,7 @@ impl InsightChatService {
let tools = InsightGenerator::build_tool_definitions(
offer_describe_tool,
self.generator.apollo_enabled(),
req.disable_writes,
);
let image_base64: Option<String> = if offer_describe_tool {
@@ -814,6 +824,9 @@ impl InsightChatService {
let original_system_content = annotate_system_with_budget(&mut messages, max_iterations);
// Per-turn describe_photo memo, same intent as the non-streaming
// path: avoid replaying conflicting visual descriptions in transcript.
let describe_cache: tokio::sync::Mutex<Option<String>> = tokio::sync::Mutex::new(None);
let mut tool_calls_made = 0usize;
let mut iterations_used = 0usize;
let mut last_prompt_eval_count: Option<i32> = None;
@@ -889,6 +902,7 @@ impl InsightChatService {
&image_base64,
&normalized,
&cx,
Some(&describe_cache),
)
.await;
let (result_preview, result_truncated) = truncate_tool_result(&result);
@@ -1134,8 +1148,12 @@ fn annotate_system_with_budget(
return None;
}
let original = first.content.clone();
+// Formatted as its own section so small models don't skim past it the
+// way they tend to with parenthetical asides at the bottom of a long prompt.
+// Phrasing matches the base prompt: budget = capacity, not a constraint
+// to conserve. Small models otherwise tend to stop early.
first.content = format!(
-"{}\n\n(Budget for this chat turn: up to {} tool-calling iterations. Produce your final reply before the budget is exhausted.)",
+"{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
first.content, max_iterations
);
Some(original)

File diff suppressed because it is too large.

View File

@@ -20,22 +20,24 @@ impl SmsApiClient {
}
}
-/// Fetch messages for a specific contact within ±4 days of the given timestamp
-/// Falls back to all contacts if no messages found for the specific contact
-/// Messages are sorted by proximity to the center timestamp
+/// Fetch messages for a specific contact within ±`days_radius` days of
+/// the given timestamp (defaults to ±4 days when `None`). Falls back to
+/// all contacts if no messages are found for the specified contact.
+/// Messages are sorted by proximity to the center timestamp.
pub async fn fetch_messages_for_contact(
&self,
contact: Option<&str>,
center_timestamp: i64,
+days_radius: Option<i64>,
) -> Result<Vec<SmsMessage>> {
use chrono::Duration;
-// Calculate ±4 days range around the center timestamp
+let radius = days_radius.unwrap_or(4).clamp(1, 30);
let center_dt = chrono::DateTime::from_timestamp(center_timestamp, 0)
.ok_or_else(|| anyhow::anyhow!("Invalid timestamp"))?;
-let start_dt = center_dt - Duration::days(4);
-let end_dt = center_dt + Duration::days(4);
+let start_dt = center_dt - Duration::days(radius);
+let end_dt = center_dt + Duration::days(radius);
let start_ts = start_dt.timestamp();
let end_ts = end_dt.timestamp();
@@ -43,8 +45,9 @@ impl SmsApiClient {
// If contact specified, try fetching for that contact first
if let Some(contact_name) = contact {
log::info!(
-"Fetching SMS for contact: {} (±4 days from {})",
+"Fetching SMS for contact: {} (±{} days from {})",
contact_name,
+radius,
center_dt.format("%Y-%m-%d %H:%M:%S")
);
let messages = self
@@ -68,7 +71,8 @@ impl SmsApiClient {
// Fallback to all contacts
log::info!(
-"Fetching all SMS messages (±4 days from {})",
+"Fetching all SMS messages (±{} days from {})",
+radius,
center_dt.format("%Y-%m-%d %H:%M:%S")
);
self.fetch_messages(start_ts, end_ts, None, Some(center_timestamp))

View File

@@ -331,6 +331,7 @@ async fn main() -> anyhow::Result<()> {
None,
Vec::new(),
Vec::new(),
false, // disable_writes — keep KB writes on for the population job
)
.await
{

View File

@@ -1718,7 +1718,12 @@ mod tests {
// Mock — files.rs tests don't exercise the date-override endpoints.
// Returning a synthetic row keeps the trait satisfied without
// depending on private DbError constructors.
-Ok(mock_exif_row(library_id, rel_path, Some(date_taken), Some("manual".to_string())))
+Ok(mock_exif_row(
+library_id,
+rel_path,
+Some(date_taken),
+Some("manual".to_string()),
+))
}
fn clear_manual_date_taken(

View File

@@ -995,10 +995,8 @@ async fn upload_image(
}
};
let perceptual = perceptual_hash::compute(&uploaded_path);
-let resolved_date = date_resolver::resolve_date_taken(
-&uploaded_path,
-exif_data.date_taken,
-);
+let resolved_date =
+date_resolver::resolve_date_taken(&uploaded_path, exif_data.date_taken);
let insert_exif = InsertImageExif {
library_id: target_library.id,
file_path: relative_path.clone(),
@@ -1022,8 +1020,7 @@ async fn upload_image(
size_bytes,
phash_64: perceptual.map(|h| h.phash_64),
dhash_64: perceptual.map(|h| h.dhash_64),
-date_taken_source: resolved_date
-.map(|r| r.source.as_str().to_string()),
+date_taken_source: resolved_date.map(|r| r.source.as_str().to_string()),
};
if let Ok(mut dao) = exif_dao.lock() {