From 10ba706b39d6bebf24ebc1ccf3bf3c651e53a59c Mon Sep 17 00:00:00 2001
From: Cameron Cordes <cameronc.dev@gmail.com>
Date: Thu, 7 May 2026 10:36:05 -0400
Subject: [PATCH] ai: reframe iteration budget as capacity, not constraint
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Small models (~8B) were bailing out of the agentic loop after one or two
tool calls under the previous "hard budget … stop when nearly exhausted"
phrasing. They read it as a directive to conserve iterations, and the
"trivial photos may need fewer" clause gave them an easy out.

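Flip both the agentic and chat-turn prompts to frame the budget as
capacity to spend, each with an explicit reserve clause for the final
reply (the last 1–2 iterations in the agentic prompt, the last one in
the chat-turn prompt). The agentic prompt also gets a soft floor (≥5
iterations on context-gathering tools before writing). Big models
should still deviate when warranted; small models are expected to
follow the floor.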

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
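Note for reviewers (not part of the commit message): a minimal standalone
sketch of the new chat-turn wording, in case it helps to eyeball the rendered
section. The helper name `with_budget_section` is hypothetical and does not
exist in the tree; the real change lives inside annotate_system_with_budget
and mutates the first message in place.

```rust
/// Hypothetical helper for illustration only: appends the chat-turn budget
/// section to an existing system prompt, using the wording from this patch.
fn with_budget_section(system_prompt: &str, max_iterations: usize) -> String {
    format!(
        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
        system_prompt, max_iterations
    )
}
```

Calling `with_budget_section("You are a helpful assistant.", 8)` leaves the
original prompt at the top and appends the budget section as its own heading,
which is the layout the existing code comment says small models are less
likely to skim past.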
 src/ai/insight_chat.rs      | 4 +++-
 src/ai/insight_generator.rs | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/ai/insight_chat.rs b/src/ai/insight_chat.rs
index 8739b48..58414ca 100644
--- a/src/ai/insight_chat.rs
+++ b/src/ai/insight_chat.rs
@@ -1150,8 +1150,10 @@ fn annotate_system_with_budget(
     let original = first.content.clone();
     // Formatted as its own section so small models don't skim past it the
     // way they tend to with parenthetical asides at the bottom of a long prompt.
+    // Phrasing matches the base prompt: budget = capacity, not a constraint
+    // to conserve. Small models otherwise tend to stop early.
     first.content = format!(
-        "{}\n\n## Budget for this chat turn\n\nYou have up to {} tool-calling iterations. Stop calling tools and write the final reply before the budget is exhausted.",
+        "{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
         first.content, max_iterations
     );
     Some(original)
diff --git a/src/ai/insight_generator.rs b/src/ai/insight_generator.rs
index eed0949..2b8a9d7 100644
--- a/src/ai/insight_generator.rs
+++ b/src/ai/insight_generator.rs
@@ -3708,12 +3708,11 @@ Return ONLY the summary, nothing else."#,
             "You are a personal photo memory assistant helping to reconstruct a memory from a photo.{owner_id_note}\n\n\
             {fewshot_block}\
             ## Tool-use guidance\n\
-            - Spend most of your iteration budget on context-gathering tools before writing the final insight. A typical strong run uses 4–8 tool calls; trivial photos may need fewer, rich photos more.\n\
+            - You have up to {max_iterations} iterations available. Aim to use at least 5 of them on context-gathering tools before writing — only skip context-gathering when the photo is genuinely trivial (e.g. a screenshot of a receipt). Reserve your last 1–2 iterations for writing the final insight.\n\
             - When you call get_sms_messages or search_rag and you know the contact, also make at least one call WITHOUT a contact filter so you can see what else was happening in {owner_name}'s life around this date.\n\
             - Use recall_facts_for_photo and recall_entities early to load any prior knowledge about subjects in this photo.\n\
             - {knowledge_write_line}\n\
-            - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.\n\
-            - You have a hard budget of {max_iterations} tool-calling iterations. When the budget is nearly exhausted, stop calling tools and write the final insight.",
+            - If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.",
             owner_id_note = owner_id_note,
             fewshot_block = fewshot_block,
             owner_name = owner_name,