From 10ba706b39d6bebf24ebc1ccf3bf3c651e53a59c Mon Sep 17 00:00:00 2001
From: Cameron Cordes
Date: Thu, 7 May 2026 10:36:05 -0400
Subject: [PATCH] ai: reframe iteration budget as capacity, not constraint
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Small models (~8B) were bailing out of the agentic loop after one or two
tool calls under the previous "hard budget … stop when nearly exhausted"
phrasing. They read that as a conservation directive and the "trivial
photos may need fewer" clause gave them an easy out.

Flipped both the agentic and chat-turn prompts to frame the budget as
capacity to spend, with a soft floor (≥5 tool calls before writing) and
an explicit reserve clause for the final reply. Big models will still
deviate when warranted; small models follow the floor.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 src/ai/insight_chat.rs      | 4 +++-
 src/ai/insight_generator.rs | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/ai/insight_chat.rs b/src/ai/insight_chat.rs
index 8739b48..58414ca 100644
--- a/src/ai/insight_chat.rs
+++ b/src/ai/insight_chat.rs
@@ -1150,8 +1150,10 @@ fn annotate_system_with_budget(
     let original = first.content.clone();
     // Formatted as its own section so small models don't skim past it the
     // way they tend to with parenthetical asides at the bottom of a long prompt.
+    // Phrasing matches the base prompt: budget = capacity, not a constraint
+    // to conserve. Small models otherwise tend to stop early.
     first.content = format!(
-        "{}\n\n## Budget for this chat turn\n\nYou have up to {} tool-calling iterations. Stop calling tools and write the final reply before the budget is exhausted.",
+        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
         first.content, max_iterations
     );
     Some(original)
diff --git a/src/ai/insight_generator.rs b/src/ai/insight_generator.rs
index eed0949..2b8a9d7 100644
--- a/src/ai/insight_generator.rs
+++ b/src/ai/insight_generator.rs
@@ -3708,12 +3708,11 @@ Return ONLY the summary, nothing else."#,
             "You are a personal photo memory assistant helping to reconstruct a memory from a photo.{owner_id_note}\n\n\
              {fewshot_block}\
              ## Tool-use guidance\n\
-             - Spend most of your iteration budget on context-gathering tools before writing the final insight. A typical strong run uses 4–8 tool calls; trivial photos may need fewer, rich photos more.\n\
+             - You have up to {max_iterations} iterations available. Aim to use at least 5 of them on context-gathering tools before writing — only skip context-gathering when the photo is genuinely trivial (e.g. a screenshot of a receipt). Reserve your last 1–2 iterations for writing the final insight.\n\
              - When you call get_sms_messages or search_rag and you know the contact, also make at least one call WITHOUT a contact filter so you can see what else was happening in {owner_name}'s life around this date.\n\
              - Use recall_facts_for_photo and recall_entities early to load any prior knowledge about subjects in this photo.\n\
              - {knowledge_write_line}\n\
-             - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.\n\
-             - You have a hard budget of {max_iterations} tool-calling iterations. When the budget is nearly exhausted, stop calling tools and write the final insight.",
+             - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.",
             owner_id_note = owner_id_note,
             fewshot_block = fewshot_block,
             owner_name = owner_name,
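
For reviewers who want to try the chat-turn change in isolation, here is a minimal sketch of the annotate-and-restore pattern from the first hunk. The `Message` struct, the function signature, and the `restore_system` helper are assumptions made for illustration; only the format string and the `Some(original)` return value come from the diff. The real function in `src/ai/insight_chat.rs` may differ, for example by checking the message role.

```rust
/// Hypothetical stand-in for the chat message type; the real type in
/// insight_chat.rs is not shown in the diff.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Appends the budget section to the first message and returns the original
/// content so the caller can put it back afterwards. Returns None when there
/// is no message to annotate. The signature is assumed, not taken from the repo.
pub fn annotate_system_with_budget(
    messages: &mut [Message],
    max_iterations: usize,
) -> Option<String> {
    let first = messages.first_mut()?;
    let original = first.content.clone();
    // Same wording as the added line in the diff: budget framed as capacity,
    // with the last iteration reserved for the reply.
    first.content = format!(
        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
        original, max_iterations
    );
    Some(original)
}

/// Hypothetical helper: restores the content saved by the function above.
pub fn restore_system(messages: &mut [Message], original: Option<String>) {
    if let (Some(first), Some(content)) = (messages.first_mut(), original) {
        first.content = content;
    }
}
```

Calling `annotate_system_with_budget(&mut msgs, 8)` before a chat turn and `restore_system(&mut msgs, original)` after it keeps the budget section out of the stored conversation, which appears to be the reason the function returns the original content.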