ai: reframe iteration budget as capacity, not constraint

Small models (~8B) were bailing out of the agentic loop after one or two tool calls under the previous "hard budget … stop when nearly exhausted" phrasing. They read that as a conservation directive and the "trivial photos may need fewer" clause gave them an easy out.

Flipped both the agentic and chat-turn prompts to frame the budget as capacity to spend, with a soft floor (≥5 tool calls before writing) and an explicit reserve clause for the final reply. Big models will still deviate when warranted; small models follow the floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 10:36:05 -04:00
parent 9071d05932
commit 10ba706b39
2 changed files with 5 additions and 4 deletions
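
For readers who want to try the reframed chat-turn wording described in the commit message outside the codebase, here is a minimal, self-contained sketch. The function names `with_budget_section` and `gathering_capacity`, the `usize` parameter type, and the sample base prompt are assumptions made for illustration; only the format string itself is taken from the change below.

```rust
/// Hypothetical standalone helper (not part of the commit): appends the
/// capacity-framed budget section to an existing system prompt.
/// The wording of the format string is copied from the diff; the
/// signature and `usize` type are assumptions.
fn with_budget_section(system_prompt: &str, max_iterations: usize) -> String {
    format!(
        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
        system_prompt, max_iterations
    )
}

/// Hypothetical helper: iterations left for context-gathering once
/// `reserve` iterations are held back for the final reply.
/// Saturates at zero so a tiny budget cannot underflow.
fn gathering_capacity(max_iterations: usize, reserve: usize) -> usize {
    max_iterations.saturating_sub(reserve)
}

fn main() {
    // Sample base prompt, invented for this example.
    let prompt = with_budget_section("You are a personal photo memory assistant.", 8);
    println!("{prompt}");
    println!("iterations for context-gathering: {}", gathering_capacity(8, 1));
}
```

Keeping the budget text in its own `##` section, rather than as a trailing aside, follows the rationale given in the code comment in the first hunk: small models tend to skim past parenthetical notes at the end of a long prompt.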
@@ -1150,8 +1150,10 @@ fn annotate_system_with_budget(
    let original = first.content.clone();
    // Formatted as its own section so small models don't skim past it the
    // way they tend to with parenthetical asides at the bottom of a long prompt.
+    // Phrasing matches the base prompt: budget = capacity, not a constraint
+    // to conserve. Small models otherwise tend to stop early.
    first.content = format!(
-        "{}\n\n## Budget for this chat turn\n\nYou have up to {} tool-calling iterations. Stop calling tools and write the final reply before the budget is exhausted.",
+        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
        first.content, max_iterations
    );
    Some(original)
@@ -3708,12 +3708,11 @@ Return ONLY the summary, nothing else."#,
            "You are a personal photo memory assistant helping to reconstruct a memory from a photo.{owner_id_note}\n\n\
            {fewshot_block}\
            ## Tool-use guidance\n\
-            - Spend most of your iteration budget on context-gathering tools before writing the final insight. A typical strong run uses 4–8 tool calls; trivial photos may need fewer, rich photos more.\n\
+            - You have up to {max_iterations} iterations available. Aim to use at least 5 of them on context-gathering tools before writing — only skip context-gathering when the photo is genuinely trivial (e.g. a screenshot of a receipt). Reserve your last 1–2 iterations for writing the final insight.\n\
            - When you call get_sms_messages or search_rag and you know the contact, also make at least one call WITHOUT a contact filter so you can see what else was happening in {owner_name}'s life around this date.\n\
            - Use recall_facts_for_photo and recall_entities early to load any prior knowledge about subjects in this photo.\n\
            - {knowledge_write_line}\n\
-            - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.\n\
-            - You have a hard budget of {max_iterations} tool-calling iterations. When the budget is nearly exhausted, stop calling tools and write the final insight.",
+            - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.",
            owner_id_note = owner_id_note,
            fewshot_block = fewshot_block,
            owner_name = owner_name,