Small models (~8B) were producing generic responses regardless of
persona, and bailing out of the agentic loop on iteration 1. Two
underlying causes:

1. Photo facts (date, location, contact, tags, visual) were buried
   between the "Please analyze this photo" preamble and the "Use the
   available tools" outro. Small models skim and miss them, which is
   why outputs weren't anchoring to the actual photo.
2. The user message ended with "write a detailed insight", so small
   models took the path of least resistance and just wrote, ignoring
   the soft "aim to use 5 tools" floor in the system prompt.

Restructured the user message:
- Leads with a "## This photo" bulleted block so the metadata is
  visible top-down. File path, date+source, contact, location+GPS,
  tags, and (in hybrid) the visual description are all bullets the
  model can't skim past.
- Replaces the prose body with a numbered "## What to do" recipe:
  (1) recall_facts_for_photo + recall_entities, (2) ≥3 of the
  time-window tools, (3) write only after tool results, referencing
  specific facts. "Generic narration is not acceptable" is explicit.
- Ends with a hard forcing line: "YOUR FIRST RESPONSE MUST BE A TOOL
  CALL. Do not output any final answer text until you have called at
  least 5 tools." Replaces the soft "aim to" floor with a directive
  small models actually follow.
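
For orientation, a rough sketch of the new message layout. This is not
the code from this change: the function name, parameter names, and
bullet labels are hypothetical. Only the two headings, the tool names
recall_facts_for_photo and recall_entities, and the forcing line are
taken from the description above.

```python
from typing import List, Optional, Tuple

# Forcing line quoted from the description above.
FORCING_LINE = (
    "YOUR FIRST RESPONSE MUST BE A TOOL CALL. Do not output any final "
    "answer text until you have called at least 5 tools."
)


def build_user_message(  # hypothetical name and signature
    file_path: str,
    date: str,
    date_source: str,
    contact: Optional[str] = None,
    location: Optional[str] = None,
    gps: Optional[Tuple[float, float]] = None,
    tags: Optional[List[str]] = None,
    visual_description: Optional[str] = None,  # only set in hybrid
) -> str:
    """Assemble the user message: facts first, recipe second, forcing line last."""
    # Bullet labels are illustrative; missing fields are simply omitted.
    bullets = [
        f"- File: {file_path}",
        f"- Date: {date} (source: {date_source})",
    ]
    if contact:
        bullets.append(f"- Contact: {contact}")
    if location:
        where = location
        if gps is not None:
            where += f" (GPS: {gps[0]:.5f}, {gps[1]:.5f})"
        bullets.append(f"- Location: {where}")
    if tags:
        bullets.append(f"- Tags: {', '.join(tags)}")
    if visual_description:
        bullets.append(f"- Visual: {visual_description}")

    # Numbered recipe paraphrasing steps (1) to (3) above.
    steps = [
        "1. Call recall_facts_for_photo and recall_entities.",
        "2. Call at least 3 of the time-window tools.",
        "3. Write only after tool results are in, referencing specific "
        "facts. Generic narration is not acceptable.",
    ]

    return "\n".join(
        ["## This photo", *bullets, "", "## What to do", *steps, "", FORCING_LINE]
    )
```

Keeping the forcing line as the final element means it is the last
thing the model reads before responding, which is the position the old
"write a detailed insight" instruction used to occupy.
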
Tradeoff: big models also follow the recipe literally and may call
5 tools when 3 would do. This optimizes for the small-model floor
first; the forcing line can be softened once that's working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>