Small models (~8B) were producing generic responses regardless of
persona, and bailing out of the agentic loop on iteration 1. Two
underlying causes:
1. Photo facts (date, location, contact, tags, visual) were buried
between the "Please analyze this photo" preamble and the "Use the
available tools" outro. Small models skim and miss them, which is why
outputs weren't anchoring to the actual photo.
2. The user message ended with "write a detailed insight" — small
models took the path of least resistance and just wrote, ignoring
the soft "aim to use 5 tools" floor in the system prompt.
Restructured the user message:
- Leads with a "## This photo" bulleted block so the metadata is
visible top-down. File path, date+source, contact, location+GPS,
tags, and (in hybrid) the visual description are all bullets the
model can't skim past.
- Replaces the prose body with a numbered "## What to do" recipe:
(1) recall_facts_for_photo + recall_entities, (2) ≥3 of the
time-window tools, (3) write only after tool results, referencing
specific facts. "Generic narration is not acceptable" is explicit.
- Ends with a hard forcing line: "YOUR FIRST RESPONSE MUST BE A TOOL
CALL. Do not output any final answer text until you have called at
least 5 tools." Replaces the soft "aim to" floor with a directive
small models actually follow.
Tradeoff: big models also follow the recipe literally and may call
5 tools when 3 would do. Optimizing for the small-model floor first;
soften once that's working.
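
A minimal sketch of the restructured user message, assuming a hypothetical `PhotoFacts` struct and `build_user_message` helper (neither name is from the codebase). The bullet labels and the shortened recipe wording are illustrative; only the "## This photo" / "## What to do" headings and the forcing line follow the text above.

```rust
/// Hypothetical container for the metadata listed above; field names are assumptions.
pub struct PhotoFacts<'a> {
    pub file_path: &'a str,
    pub date_line: &'a str,
    pub contact: Option<&'a str>,
    pub location: Option<&'a str>,
    pub tags: &'a [&'a str],
    /// Present only in hybrid mode.
    pub visual: Option<&'a str>,
}

/// Builds the user message: metadata bullets first, numbered recipe second,
/// forcing line last.
pub fn build_user_message(facts: &PhotoFacts<'_>, min_tool_calls: usize) -> String {
    let mut bullets = vec![
        format!("- File path: {}", facts.file_path),
        format!("- Date taken: {}", facts.date_line),
    ];
    // Optional facts become bullets only when they exist, so the block never
    // contains empty placeholder lines.
    if let Some(contact) = facts.contact {
        bullets.push(format!("- Contact: {contact}"));
    }
    if let Some(location) = facts.location {
        bullets.push(format!("- Location: {location}"));
    }
    if !facts.tags.is_empty() {
        bullets.push(format!("- Tags: {}", facts.tags.join(", ")));
    }
    if let Some(visual) = facts.visual {
        bullets.push(format!("- Visual description: {visual}"));
    }

    format!(
        "## This photo\n\n{}\n\n## What to do\n\n\
         1. First, call recall_facts_for_photo and recall_entities.\n\
         2. Then call at least 3 of the time-window tools.\n\
         3. Only after you have tool results, write the final insight. Generic narration is not acceptable.\n\n\
         YOUR FIRST RESPONSE MUST BE A TOOL CALL. Do not output any final answer text until you have called at least {} tools.",
        bullets.join("\n"),
        min_tool_calls
    )
}
```

Keeping the forcing line as the very last thing in the message is the point of the layout: it is the final instruction the model reads before responding.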
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The DAO swallowed every diesel::update failure as a flat
`anyhow!("Update error")`, then trace_db_call further reduced it to
`DbError { kind: UpdateError }`. Operators saw "update failed for lib
2 Snapchat/foo.mp4: DbError { kind: UpdateError }" with no clue why
(constraint violation? type mismatch? row vanished mid-flight? DB
locked?).
Three changes:
- Preserve the diesel error in the anyhow chain along with the input
params (lib, rel_path, date_taken, source) so the cause is visible.
- Log the chain at warn-level inside the DAO before the trace wrapper
collapses it to DbErrorKind::UpdateError, so the warning at the
call site finally has something diagnosable next to it.
- Treat zero-row updates as a debug-level "row likely retired by the
missing-file scan" rather than a hard failure — that case is benign
and shouldn't poison the drain's error tally.
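
The idea can be sketched with the standard library alone. `DriverError` stands in for the diesel error, and `UpdateFailure`, `UpdateOutcome`, `classify_update`, and `render_chain` are hypothetical names, not the DAO's actual types; the real code uses anyhow and the project's own logging.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the underlying database error (the diesel error in the real DAO).
#[derive(Debug)]
pub struct DriverError(pub String);

impl fmt::Display for DriverError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for DriverError {}

/// Hypothetical wrapper that keeps the input params and the original cause.
#[derive(Debug)]
pub struct UpdateFailure {
    pub library_id: i32,
    pub rel_path: String,
    pub source: DriverError,
}

impl fmt::Display for UpdateFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "update failed for lib {} {}", self.library_id, self.rel_path)
    }
}

impl Error for UpdateFailure {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

#[derive(Debug, PartialEq)]
pub enum UpdateOutcome {
    Updated,
    /// Zero rows matched: treated as benign, not as an error.
    RowMissing,
}

/// Maps a raw "rows affected" result to an outcome, preserving the cause on failure.
pub fn classify_update(
    library_id: i32,
    rel_path: &str,
    result: Result<usize, DriverError>,
) -> Result<UpdateOutcome, UpdateFailure> {
    match result {
        Ok(0) => Ok(UpdateOutcome::RowMissing),
        Ok(_) => Ok(UpdateOutcome::Updated),
        Err(source) => Err(UpdateFailure {
            library_id,
            rel_path: rel_path.to_string(),
            source,
        }),
    }
}

/// Flattens an error and all of its sources into one loggable line.
pub fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    while let Some(inner) = cause {
        out.push_str(": ");
        out.push_str(&inner.to_string());
        cause = inner.source();
    }
    out
}
```

Logging `render_chain(&err)` before any wrapper collapses the error to a bare kind is what keeps the cause visible at the call site.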
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Small models (~8B) were bailing out of the agentic loop after one or two
tool calls under the previous "hard budget … stop when nearly exhausted"
phrasing. They read that as a conservation directive and the
"trivial photos may need fewer" clause gave them an easy out.
Flipped both the agentic and chat-turn prompts to frame the budget as
capacity to spend, with a soft floor (≥5 tool calls before writing) and
an explicit reserve clause for the final reply. Big models will still
deviate when warranted; small models follow the floor.
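
A rough sketch of the capacity framing, assuming a hypothetical `budget_guidance` helper. The wording is paraphrased, and clamping the floor to what the budget leaves after the reserve is an assumption of this sketch, not something the commit states.

```rust
/// Renders the budget line as capacity to spend, with a soft floor and a reserve.
pub fn budget_guidance(max_iterations: u32, soft_floor: u32, reserve: u32) -> String {
    // Assumption: never ask for more gathering calls than the budget leaves
    // room for once the reply iterations are reserved.
    let floor = soft_floor.min(max_iterations.saturating_sub(reserve));
    format!(
        "- You have up to {max_iterations} iterations available. \
         Aim to use at least {floor} of them on context-gathering tools before writing. \
         Reserve your last {reserve} for writing the final insight."
    )
}
```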
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug fixes:
- get_sms_messages.days_radius is now actually honored (was hardcoded
to ±4d in SmsApiClient::fetch_messages_for_contact).
- describe_photo memoized for the lifetime of one agentic loop / one
chat turn — re-running mid-loop produced conflicting visual
descriptions in the transcript.
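
The describe_photo memoization could look roughly like the following. `DescribeCache` is a hypothetical name, and the closure stands in for the real vision call; one cache is assumed to be created per agentic loop or chat turn and dropped afterwards.

```rust
/// Caches one visual description for the lifetime of a single loop / chat turn.
pub struct DescribeCache {
    cached: Option<String>,
}

impl DescribeCache {
    pub fn new() -> Self {
        Self { cached: None }
    }

    /// Runs `describe` only on the first call; later calls return the stored
    /// text, so the transcript never contains two different descriptions.
    pub fn get_or_describe<F>(&mut self, describe: F) -> &str
    where
        F: FnOnce() -> String,
    {
        self.cached.get_or_insert_with(describe).as_str()
    }
}
```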
Agentic user message:
- Pre-resolve location via Apollo + Nominatim and emit one Location:
line instead of bare GPS, mirroring the non-agentic flow.
- Date now formats with weekday + canonical-date source so the model
can hedge on fs_time-derived dates.
- Hybrid mode visual block tells the model not to call describe_photo
(the tool is already gated off in hybrid).
System prompt structure:
- custom_system_prompt now appends under an explicit "User overrides
(these take precedence)" heading instead of prepending — so a custom
voice/POV/format prompt actually beats the built-in defaults.
- Numbered rules collapsed into bulleted "Tool-use guidance"; merged
the contradictory "multiple tools BEFORE" / "after 5 calls" rules.
- Chat budget annotation surfaces as its own ## heading.
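
A minimal sketch of the append-instead-of-prepend change, assuming a hypothetical `compose_system_prompt` helper. The heading text follows the description above; rendering it as a `##` heading and skipping blank custom prompts are assumptions.

```rust
/// Appends the custom prompt under an explicit override heading so it comes
/// after, and can therefore beat, the built-in defaults.
pub fn compose_system_prompt(base: &str, custom: Option<&str>) -> String {
    // Assumption: a missing or whitespace-only custom prompt adds no heading.
    match custom.map(str::trim).filter(|c| !c.is_empty()) {
        Some(custom) => {
            format!("{base}\n\n## User overrides (these take precedence)\n\n{custom}")
        }
        None => base.to_string(),
    }
}
```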
New tools:
- recall_facts_for_entity(entity_id|name) — facts for one entity
without needing a photo path. Fills the "tell me about Sarah" chat
case where recall_facts_for_photo doesn't apply.
- find_photos_with_entity(entity_id|name) — "when did I last see X /
show me photos from the Tahoe trip" via entity_photo_links.
- get_exif(file_path) — full EXIF row for any photo, for technical
("what camera was this on?") questions.
Tools removed:
- get_file_tags duplicated the inline Tags: line on the user message;
exposing both gave models an excuse to "confirm" what they had.
Tool descriptions tightened:
- search_rag now correctly says "per-day, per-contact summaries" and
explains the date is for time-decay weighting.
- recall_entities warns about empty-filter dumps.
- store_entity / store_fact document dedup return + snake_case
predicate vocabulary.
- reverse_geocode defers to the pre-resolved location and to
get_personal_place_at for personal places.
- get_current_datetime narrowed to time-since-photo use.
Calendar / location:
- get_calendar_events accepts query and embeds it for hybrid time +
semantic ranking (was always passing None for the embedding).
- get_location_history exposes limit; description tells the model
there's no semantic ranking on this surface.
New disable_writes flag:
- POST /insights/generate/agentic and the chat endpoints accept
disable_writes: bool. When true, drops store_entity / store_fact
from the tool palette and rewrites the system prompt's knowledge-
write line. Lets users explore alternate prompts (caption-style,
third-person, haiku) without polluting the persistent KB.
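
The gating can be sketched as two small functions. `visible_tools` and `knowledge_write_line` are hypothetical helpers, the tool palette is reduced to plain names, and the prompt strings are paraphrased rather than copied from the code.

```rust
/// Names of the tools that write to the persistent knowledge base.
const WRITE_TOOLS: [&str; 2] = ["store_entity", "store_fact"];

/// Returns the tool names the model should see for this run.
pub fn visible_tools<'a>(all_tools: &[&'a str], disable_writes: bool) -> Vec<&'a str> {
    all_tools
        .iter()
        .copied()
        .filter(|name| !(disable_writes && WRITE_TOOLS.contains(name)))
        .collect()
}

/// Picks the system-prompt line that matches the palette, so the model is
/// never told to call a tool it was not given.
pub fn knowledge_write_line(disable_writes: bool) -> &'static str {
    if disable_writes {
        "Knowledge-memory writes are disabled for this run."
    } else {
        "Use store_entity and store_fact to record what you learn."
    }
}
```

Deriving both the palette and the prompt line from the same flag keeps them from drifting apart.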
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 22:37:32 -04:00
7 changed files with 654 additions and 156 deletions
// Formatted as its own section so small models don't skim past it the
// way they tend to with parenthetical asides at the bottom of a long prompt.
// Phrasing matches the base prompt: budget = capacity, not a constraint
// to conserve. Small models otherwise tend to stop early.
first.content = format!(
"{}\n\n(Budget for this chat turn: up to {} tool-calling iterations. Produce your final reply before the budget is exhausted.)",
"{}\n\n## Budget for this chat turn\n\nYou have up to {} iterations available. Use as many as the question warrants for context-gathering, and reserve the last one for your reply.",
"Search conversation history using semantic search. Use this to find relevant past conversations about specific topics, people, or events.",
"Semantic search over per-day, per-contact CONVERSATION SUMMARIES (not raw messages). Each hit is one compressed paragraph for one (date, contact) day. Use for high-level themes around a date — for specific wording, call `search_messages`. The `date` argument biases ranking toward summaries near that date and is required even when searching across all time.",
serde_json::json!({
"type": "object",
"required": ["query","date"],
"properties": {
"query": {
"type": "string",
"description": "The search query to find relevant conversations"
"description": "The reference date in YYYY-MM-DD format"
"description": "Reference date in YYYY-MM-DD format. Used for time-decay weighting (closer summaries rank higher)."
},
"contact": {
"type": "string",
"description": "Optional contact name to filter results"
"description": "Optional contact name to filter results. When you know the contact, also make at least one call WITHOUT this filter to surface what else was happening that week."
},
"limit": {
"type": "integer",
@@ -2564,7 +2853,7 @@ Return ONLY the summary, nothing else."#,
),
Tool::function(
"get_calendar_events",
"Fetch calendar events near a specific date. Shows scheduled events, meetings, and activities.",
"Fetch calendar events near a specific date. Pass a `query` to rank by semantic similarity (e.g. 'wedding', 'doctor visit', 'work travel') in addition to time. Without a query, results are time-ordered.",
serde_json::json!({
"type": "object",
"required": ["date"],
@@ -2573,6 +2862,10 @@ Return ONLY the summary, nothing else."#,
"type": "string",
"description": "The center date in YYYY-MM-DD format"
},
"query": {
"type": "string",
"description": "Optional topic / theme to rank events by (semantic). Pairs well with photo-relevant cues from the visual description."
},
"days_radius": {
"type": "integer",
"description": "Number of days before and after the date to search (default: 7)"
@@ -2586,7 +2879,7 @@ Return ONLY the summary, nothing else."#,
),
Tool::function(
"get_location_history",
"Fetch location history records near a specific date. Shows places visited and activities.",
"Fetch location-history records (lat/lon / activity / place_name) within ±days_radius days of a date. Time-ordered; no semantic ranking. Useful for reconstructing where the user was around the photo's timestamp.",
serde_json::json!({
"type": "object",
"required": ["date"],
@@ -2598,29 +2891,22 @@ Return ONLY the summary, nothing else."#,
"days_radius": {
"type": "integer",
"description": "Number of days before and after the date to search (default: 14)"
}
}
}),
),
Tool::function(
"get_file_tags",
"Get tags/labels that have been applied to a specific photo file.",
serde_json::json!({
"type": "object",
"required": ["file_path"],
"properties": {
"file_path": {
"type": "string",
"description": "The file path of the photo to get tags for"
},
"limit": {
"type": "integer",
"description": "Maximum number of records to return (default: 20, max: 50)"
}
}
}),
),
];
// (`get_file_tags` was removed — the tags for the current photo are
// already inlined in the user message, and exposing both gave models
// an excuse to spend an iteration "confirming" what they already had.)
tools.push(Tool::function(
"reverse_geocode",
"Convert GPS latitude/longitude coordinates to a human-readable place name (city, state). Use this when GPS coordinates are available in the photo metadata, or to resolve coordinates returned by get_location_history.",
"Convert GPS latitude/longitude to a city/state place name via Nominatim. The photo's primary location is already pre-resolved on the user message — only call this for *other* coordinates (e.g. those returned by get_location_history). For the user's personal places (Home, Work, Cabin) prefer `get_personal_place_at`.",
serde_json::json!({
"type": "object",
"required": ["latitude","longitude"],
@@ -2642,7 +2928,7 @@ Return ONLY the summary, nothing else."#,
if apollo_enabled {
tools.push(Tool::function(
"get_personal_place_at",
"Get the user's personal, named place (e.g. Home, Work, Cabin) at a GPS coordinate, if any. Returns the place name, category, free-text description (the user's own notes about the location), and radius. More specific than reverse_geocode — prefer this when both apply.",
"Get the user's personal, named place (e.g. Home, Work, Cabin) at a GPS coordinate, if any. Returns the place name, category, free-text description (the user's own notes about the location), and radius. More specific than reverse_geocode — prefer this when both apply. The cheap default is to call this once with the photo's GPS before any other location reasoning.",
serde_json::json!({
"type": "object",
"required": ["latitude","longitude"],
@@ -2657,7 +2943,7 @@ Return ONLY the summary, nothing else."#,
// Knowledge memory tools
tools.push(Tool::function(
"recall_entities",
"Search the knowledge memory for people, places, events, or things previously learned from other photos. Use this to retrieve context about subjects appearing in this photo.",
"Search the knowledge memory for people, places, events, or things previously learned from other photos. Provide at least one of `name` or `entity_type` — calling with neither dumps up to 50 entities ordered by id, which is rarely what you want.",
serde_json::json!({
"type": "object",
"properties": {
@@ -2668,7 +2954,7 @@ Return ONLY the summary, nothing else."#,
"entity_type": {
"type": "string",
"enum": ["person","place","event","thing"],
"description": "Filter by entity type (optional)"
"description": "Filter by entity type. Pass alone to enumerate everything of one kind (e.g. 'all known places')."
},
"limit": {
"type": "integer",
@@ -2693,9 +2979,80 @@ Return ONLY the summary, nothing else."#,
}),
));
tools.push(Tool::function(
"recall_facts_for_entity",
"Retrieve all stored facts about one entity (person, place, event, thing) without needing a photo path. Use in chat when the user asks about a known subject (e.g. 'tell me more about Sarah') and you have the entity_id from recall_entities — or pass `name` to look it up by name.",
serde_json::json!({
"type": "object",
"properties": {
"entity_id": {
"type": "integer",
"description": "The entity's ID (preferred — exact match)."
},
"name": {
"type": "string",
"description": "Entity name (case-insensitive). Resolves to the highest-confidence active entity with that name. Provide this OR entity_id."
},
"entity_type": {
"type": "string",
"enum": ["person","place","event","thing"],
"description": "Optional filter when looking up by name (e.g. disambiguate a person named 'Tahoe' from the place)."
}
}
}),
));
tools.push(Tool::function(
"find_photos_with_entity",
"List photos that have been linked to an entity in the knowledge memory. Use to answer 'when did I last see Sarah' / 'show me photos from the Tahoe trip'. Returns file paths and the role each entity played in the photo (subject / background / location).",
serde_json::json!({
"type": "object",
"properties": {
"entity_id": {
"type": "integer",
"description": "The entity's ID (preferred — exact match)."
},
"name": {
"type": "string",
"description": "Entity name (case-insensitive). Provide this OR entity_id."
},
"entity_type": {
"type": "string",
"enum": ["person","place","event","thing"],
"description": "Optional filter when looking up by name."
},
"limit": {
"type": "integer",
"description": "Maximum number of photo paths to return (default: 20, max: 50)"
}
}
}),
));
tools.push(Tool::function(
"get_exif",
"Read the stored EXIF row for a photo file path. Returns camera make/model, lens, focal length, aperture, shutter speed, ISO, dimensions, and the date_taken source. Use to answer photography questions ('what camera was this on?') or to inspect technical metadata. The current photo's GPS, date, and date source are already on the user message — only call this when you need the additional fields.",
serde_json::json!({
"type": "object",
"required": ["file_path"],
"properties": {
"file_path": {
"type": "string",
"description": "The file path of the photo (defaults to the current photo if you pass its path)."
}
}
}),
));
// Knowledge-memory writes are gated by `disable_writes` — when set,
// exploration / chat continuation can run without polluting the
// persistent KB with one-off variants (e.g. caption-style prompts).
if !disable_writes {
tools.push(Tool::function(
"store_entity",
"Store or update a person, place, event, or thing in the knowledge memory. Call this when you identify a subject in this photo that should be remembered for future insights.",
"Store or update a person, place, event, or thing in the knowledge memory. \
Returns the entity's ID. If similarly-named entities already exist, the \
response lists them — prefer using an existing ID over creating a duplicate.",
serde_json::json!({
"type": "object",
"required": ["name","entity_type"],
@@ -2719,7 +3076,12 @@ Return ONLY the summary, nothing else."#,
tools.push(Tool::function(
"store_fact",
"Record a fact about an entity in the knowledge memory. Provide EITHER object_entity_id (when the object is a known entity whose ID you have) OR object_value (for free-text attributes). The fact will be linked to the current photo automatically.",
"Record a fact about an entity in the knowledge memory. Provide EITHER object_entity_id \
(when the object is a known entity whose ID you have) OR object_value (for free-text \
attributes). The fact is linked to the current photo automatically. Predicates use \
@@ -2747,10 +3109,11 @@ Return ONLY the summary, nothing else."#,
}
}),
));
}
tools.push(Tool::function(
"get_current_datetime",
"Get the current date and time. Useful for understanding how long ago the photo was taken.",
"Returns the current date and time. Use ONLY when you need to compute time-since-photo for phrases like 'two years ago' — the photo's date is already in the user message and re-deriving it is wasted budget.",
serde_json::json!({
"type": "object",
"properties": {}
@@ -2957,6 +3320,7 @@ Return ONLY the summary, nothing else."#,
backend: Option<String>,
fewshot_examples: Vec<Vec<ChatMessage>>,
fewshot_source_ids: Vec<i32>,
disable_writes: bool,
) -> Result<(Option<i32>, Option<i32>)> {
let tracer = global_tracer();
let current_cx = opentelemetry::Context::current();
@@ -3191,9 +3555,40 @@ Return ONLY the summary, nothing else."#,
.map(|dt| dt.date_naive())
.unwrap_or_else(|| Utc::now().date_naive());
// Date confidence comes from the canonical-date waterfall — one of
// exif/exiftool/filename/fs_time/manual. Surface in the user message
// so the model can hedge appropriately on `fs_time`-sourced dates.
// The knowledge-write line gets rewritten when disable_writes is on
// so the model isn't told to call tools that aren't on the menu.
let knowledge_write_line = if disable_writes {
"Knowledge-memory writes are disabled for this run — do not attempt to call store_entity or store_fact (they are not available)."
} else {
"When you identify people, places, events, or notable things in this photo: use store_entity to record them and store_fact to record key facts (relationships, roles, attributes). This builds a persistent memory for future insights."
};
let base_system = format!(
"You are a personal photo memory assistant helping to reconstruct a memory from a photo.{owner_id_note}\n\n\
{fewshot_block}\
IMPORTANT INSTRUCTIONS:\n\
1. You MUST call multiple tools to gather context BEFORE writing any final insight. Do not produce a final answer after only one or two tool calls.\n\
2. When calling get_sms_messages and search_rag, always make at least one call WITHOUT a contact filter to capture what else was happening in {owner_name}'s life around this date — other conversations, events, and activities provide important wider context even when a specific contact is known.\n\
3. Use recall_facts_for_photo to load any previously stored knowledge about subjects in this photo.\n\
4. Use recall_entities to look up known people, places, or things that appear in this photo.\n\
5. When you identify people, places, events, or notable things in this photo: use store_entity to record them and store_fact to record key facts (relationships, roles, attributes). This builds a persistent memory for future insights.\n\
6. Only produce your final insight AFTER you have gathered context from at least 5 tool calls.\n\
7. If a tool returns no results, that is useful information — continue calling the remaining tools anyway.\n\
8. You have a hard budget of {max_iterations} tool-calling iterations before the loop ends. Plan your context gathering so you can write a complete final insight within that budget.",
## Tool-use guidance\n\
- You have up to {max_iterations} iterations available. Aim to use at least 5 of them on context-gathering tools before writing — only skip context-gathering when the photo is genuinely trivial (e.g. a screenshot of a receipt). Reserve your last 1–2 iterations for writing the final insight.\n\
- When you call get_sms_messages or search_rag and you know the contact, also make at least one call WITHOUT a contact filter so you can see what else was happening in {owner_name}'s life around this date.\n\
- Use recall_facts_for_photo and recall_entities early to load any prior knowledge about subjects in this photo.\n\
- {knowledge_write_line}\n\
- If a tool returns no results, that's useful information — pivot to a different tool, don't retry the same call.",
owner_id_note = owner_id_note,
fewshot_block = fewshot_block,
owner_name = owner_name,
knowledge_write_line = knowledge_write_line,
max_iterations = max_iterations
);
// Custom prompts are *appended* under an explicit override heading so
// they actually beat the base instructions. Prepending was the wrong
// default — later instructions tend to win attention.
"{visual_block}Please analyze this photo and gather any relevant context from the surrounding weeks.\n\n\
Photo file path: {}\n\
Date taken: {}\n\
{}\n\
{}\n\
{}\n\n\
Use the available tools to gather more context about this moment (messages, calendar events, location history, etc.), \
then write a detailed insight with a title and summary.",
file_path,
date_taken.format("%B %d, %Y"),
contact_info,
gps_info,
tags_info,
visual_block=visual_block,
"{photo_block}\n\n\
## What to do\n\n\
1. First, call recall_facts_for_photo and recall_entities to load any prior knowledge about subjects in this photo.\n\
2. Then call at least 3 of: search_rag, get_sms_messages (try once with the contact filter and once without), get_calendar_events, get_location_history — pick the ones most relevant to this photo's date and context.\n\
3. Only after you have tool results, write the final insight with a title and a detailed summary that references specific facts from the metadata above and from your tool results. Generic narration is not acceptable.\n\n\
YOUR FIRST RESPONSE MUST BE A TOOL CALL. Do not output any final answer text until you have called at least 5 tools."
);
// 10. Define tools. Hybrid mode omits `describe_photo` since the
// chat model receives the visual description inline.