insight-chat: design for tool catalog, system prompt, and SMS fixes
Lays out the cycle: split generation system prompt into identity vs procedural blocks so personas drive voice/shape, add per-turn system_prompt override on chat (ephemeral in append mode, persisted on amend), gate optional tools on data presence, and fix the days_radius bug in get_sms_messages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
392
specs/002-insight-chat-improvements.md
Normal file
392
specs/002-insight-chat-improvements.md
Normal file
@@ -0,0 +1,392 @@
|
|||||||
|
# Insight Chat improvements — design
|
||||||
|
|
||||||
|
**Date:** 2026-05-07
|
||||||
|
**Branch:** `feature/insight-chat-improvements` (in both `ImageApi/` and `FileViewer-React/`)
|
||||||
|
**Scope:** ImageApi photo-anchored insight + chat surface, plus the
|
||||||
|
FileViewer-React client. Apollo's free/visit chat is **not** in this cycle.
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
Three concrete gaps in today's insight + chat surface:
|
||||||
|
|
||||||
|
1. **Tool drift.** ImageApi exposes 13 tools to the LLM. Some are gated on
|
||||||
|
`apollo_enabled` / `has_vision`, but several optional ones
|
||||||
|
(`search_rag`, `get_calendar_events`, `get_location_history`) are
|
||||||
|
registered unconditionally even when their backing tables are empty.
|
||||||
|
Descriptions vary in quality and a couple have outright bugs.
|
||||||
|
2. **Inconsistent / incomplete tool descriptions.** Tools like
|
||||||
|
`search_messages` describe their selection rules but omit useful
|
||||||
|
examples; `store_fact` doesn't show the `object_entity_id` vs
|
||||||
|
`object_value` choice; `get_sms_messages` accepts a `days_radius`
|
||||||
|
parameter that the backing client silently ignores. The LLM is being
|
||||||
|
instructed against a slightly wrong reality.
|
||||||
|
3. **System prompt fights the persona.** Today's generation prompt
|
||||||
|
prepends the user's `custom_system_prompt` and then immediately asserts
|
||||||
|
`"You are a personal photo memory assistant..."`. The user message
|
||||||
|
demands `"a detailed insight with a title and summary"`. Both
|
||||||
|
contradict whatever voice / shape / POV the persona just established.
|
||||||
|
On chat continuation the persona is baked into the stored transcript at
|
||||||
|
generation time and can't be changed without regenerating.
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
- Tool catalog is **representative** — every tool registered for a turn is
|
||||||
|
backed by data the user actually has.
|
||||||
|
- Tool descriptions are **concise but complete**, with examples for any
|
||||||
|
tool whose param choice has multiple modes or non-obvious interactions.
|
||||||
|
- Persona / system prompt is **authoritative** for voice, length, and
|
||||||
|
shape — both at generation and during chat continuation.
|
||||||
|
- Per-turn system prompt overrides on chat work without surprising
|
||||||
|
side-effects on the stored transcript outside `amend` mode.
|
||||||
|
|
||||||
|
## Non-goals
|
||||||
|
|
||||||
|
- Apollo backend / frontend changes. Separate cycle.
|
||||||
|
- Refactoring the `generate_photo_title` post-hoc title flow. Already
|
||||||
|
takes `custom_system_prompt`.
|
||||||
|
- Tool consolidation (e.g. merging `search_messages` + `get_sms_messages`).
|
||||||
|
Considered and deferred — keeps blast radius small.
|
||||||
|
- Removing knowledge-memory tools (`recall_*` / `store_*`). Audit
|
||||||
|
confirmed they have a live read path via `knowledge.rs` HTTP routes.
|
||||||
|
- Persisting persona changes to the stored transcript outside `amend`
|
||||||
|
mode. Deliberate — re-opens use the persona currently active in the
|
||||||
|
client, not a sticky historical setting.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Design
|
||||||
|
|
||||||
|
### A. System prompt — generation
|
||||||
|
|
||||||
|
Today (`insight_generator.rs:3305–3326`):
|
||||||
|
|
||||||
|
```
|
||||||
|
[custom_system_prompt if any] +
|
||||||
|
"You are a personal photo memory assistant helping to reconstruct..." +
|
||||||
|
{owner_id_note} +
|
||||||
|
{fewshot_block} +
|
||||||
|
"IMPORTANT INSTRUCTIONS:
|
||||||
|
1. You MUST call multiple tools...
|
||||||
|
2. When calling get_sms_messages and search_rag...
|
||||||
|
3. Use recall_facts_for_photo...
|
||||||
|
...
|
||||||
|
8. You have a hard budget of {max_iterations} iterations..."
|
||||||
|
```
|
||||||
|
|
||||||
|
The first concatenation is the bug: `custom` claims one identity, the
|
||||||
|
next line asserts another.
|
||||||
|
|
||||||
|
**New structure** — two named blocks, in order:
|
||||||
|
|
||||||
|
```
|
||||||
|
[Identity / voice / format block] ← persona-controlled (or neutral default)
|
||||||
|
[Procedural block] ← always identity-free
|
||||||
|
```
|
||||||
|
|
||||||
|
**Identity block:**
|
||||||
|
- When `custom_system_prompt` is supplied: use that string verbatim, no
|
||||||
|
pre/append.
|
||||||
|
- When not: a neutral default that doesn't fight a future persona.
|
||||||
|
Working text: `"You are reconstructing a memory from a photo. Use the
|
||||||
|
gathered context to write a thoughtful summary; you decide voice,
|
||||||
|
length, and shape."`
|
||||||
|
|
||||||
|
**Procedural block** — identity-free, always emitted:
|
||||||
|
|
||||||
|
```
|
||||||
|
Tool-use guidance:
|
||||||
|
- You have a budget of {max_iterations} tool-calling iterations.
|
||||||
|
- Call tools to gather context BEFORE writing your final answer; don't
|
||||||
|
answer after one or two calls.
|
||||||
|
- When calling get_sms_messages or search_rag, make at least one call
|
||||||
|
WITHOUT a contact filter — surrounding events matter even when a
|
||||||
|
contact is known.
|
||||||
|
- Use recall_facts_for_photo + recall_entities to load any prior
|
||||||
|
knowledge about subjects in the photo.
|
||||||
|
- When you identify people / places / events / things, use store_entity
|
||||||
|
+ store_fact to grow the persistent memory.
|
||||||
|
- A tool returning no results is informative; continue with the others.
|
||||||
|
|
||||||
|
{owner_id_note if applicable}
|
||||||
|
{fewshot_block if applicable}
|
||||||
|
```
|
||||||
|
|
||||||
|
Differences from today's "IMPORTANT INSTRUCTIONS" block: removed the
|
||||||
|
"you are a personal photo memory assistant" framing and the explicit
|
||||||
|
"at least 5 tool calls" floor (replaced with the softer "don't answer
|
||||||
|
after one or two"). Few-shot stays — it's pattern-of-tool-use, not
|
||||||
|
identity.
|
||||||
|
|
||||||
|
### B. User message — generation
|
||||||
|
|
||||||
|
Today (line 3357):
|
||||||
|
|
||||||
|
```
|
||||||
|
{visual_block}Please analyze this photo and gather any relevant context
|
||||||
|
from the surrounding weeks.
|
||||||
|
|
||||||
|
Photo file path: {file_path}
|
||||||
|
Date taken: {date}
|
||||||
|
{contact_info}
|
||||||
|
{gps_info}
|
||||||
|
{tags_info}
|
||||||
|
|
||||||
|
Use the available tools to gather more context about this moment
|
||||||
|
(messages, calendar events, location history, etc.), then write a
|
||||||
|
detailed insight with a title and summary.
|
||||||
|
```
|
||||||
|
|
||||||
|
Problems: the trailing line bakes in output shape ("title and
|
||||||
|
summary"), and the title from the resulting response is **discarded
|
||||||
|
anyway** — `generate_photo_title` (line 3494) regenerates the title
|
||||||
|
post-hoc from the summary. So the prompt is constraining voice for no
|
||||||
|
data-model benefit.
|
||||||
|
|
||||||
|
**New payload** — context-only, no output prescription:
|
||||||
|
|
||||||
|
```
|
||||||
|
{visual_block}Photo file path: {file_path}
|
||||||
|
Date taken: {date}
|
||||||
|
{contact_info}
|
||||||
|
{gps_info}
|
||||||
|
{tags_info}
|
||||||
|
|
||||||
|
Gather context with the available tools, then respond.
|
||||||
|
```
|
||||||
|
|
||||||
|
The persona owns shape. If a user wants "title-then-paragraph" output,
|
||||||
|
their persona prompt says so.
|
||||||
|
|
||||||
|
### C. System prompt — chat continuation
|
||||||
|
|
||||||
|
Add `system_prompt: Option<String>` to `ChatTurnRequest` (and to its
|
||||||
|
HTTP wrapper `ChatTurnHttpRequest`). It carries through both the
|
||||||
|
non-streaming `chat_turn` and the streaming `chat_turn_stream`.
|
||||||
|
|
||||||
|
**Append mode (default, `amend=false`)** — ephemeral
|
||||||
|
swap-and-restore, mirroring the existing `annotate_system_with_budget`
|
||||||
|
pattern:
|
||||||
|
|
||||||
|
1. Load stored transcript.
|
||||||
|
2. If `system_prompt` is `Some(s)`:
|
||||||
|
- If first message is a `system` role: stash original content,
|
||||||
|
replace with `s`.
|
||||||
|
- Else: prepend a synthetic ephemeral system message with `s` (note
|
||||||
|
it's synthetic so the restore step pops it rather than rewriting).
|
||||||
|
3. Run `annotate_system_with_budget` on top (existing per-turn budget
|
||||||
|
note appends to whatever's there now).
|
||||||
|
4. Run the agentic loop.
|
||||||
|
5. **Before persistence**, restore the original system content (or pop
|
||||||
|
the synthetic one). Run `restore_system_content` for the budget
|
||||||
|
annotation as today.
|
||||||
|
6. Save.
|
||||||
|
|
||||||
|
Result: the model sees the override; the stored transcript is
|
||||||
|
unchanged outside the model's actual reply.
|
||||||
|
|
||||||
|
**Amend mode (`amend=true`)**:
|
||||||
|
|
||||||
|
- If `system_prompt` is supplied: the override stays in place during
|
||||||
|
the serialization for the new insight row. The new row's
|
||||||
|
`training_messages` system message is the override. `is_current=false`
|
||||||
|
flips on prior rows as today.
|
||||||
|
- If not supplied: behaves as today (stored transcript's system message
|
||||||
|
carries forward unchanged).
|
||||||
|
|
||||||
|
### D. FileViewer-React — client wiring
|
||||||
|
|
||||||
|
`hooks/useInsightChat.tsx`:
|
||||||
|
- `SendTurnOptions` gains `systemPromptOverride?: string | null`.
|
||||||
|
- Inside `sendTurn`, before issuing the streaming POST:
|
||||||
|
1. Read the active persona's `systemPrompt` from AsyncStorage
|
||||||
|
(already loaded for generation flows — reuse the same accessor).
|
||||||
|
2. If a one-shot `systemPromptOverride` is set, append as a suffix
|
||||||
|
(`${persona}\n\n${override}`) so persona voice survives + override
|
||||||
|
tweaks the turn.
|
||||||
|
3. Include the resulting string as `system_prompt` on the request body.
|
||||||
|
- No history-load change. The history endpoint still returns the stored
|
||||||
|
transcript.
|
||||||
|
|
||||||
|
`components/InsightChatModal.tsx`:
|
||||||
|
- Add a small "Style note" composer affordance — a one-shot text input
|
||||||
|
that, when filled, becomes the `systemPromptOverride` for the next
|
||||||
|
send. Cleared after send.
|
||||||
|
- The existing persona chip continues to open `PersonaManagerModal`.
|
||||||
|
|
||||||
|
`hooks/usePersonas.tsx` and the bundled defaults:
|
||||||
|
- Built-in `assistant` and `journal` prompts get audited and rewritten
|
||||||
|
to **explicitly state voice / shape / length** — since the framework
|
||||||
|
no longer guarantees a default shape, the persona must.
|
||||||
|
|
||||||
|
### E. Tool catalog — gating
|
||||||
|
|
||||||
|
Widen `build_tool_definitions` from `(has_vision: bool, apollo_enabled:
|
||||||
|
bool)` to a single `ToolGateOpts` struct:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub struct ToolGateOpts {
|
||||||
|
pub has_vision: bool,
|
||||||
|
pub apollo_enabled: bool,
|
||||||
|
pub daily_summaries_present: bool,
|
||||||
|
pub calendar_present: bool,
|
||||||
|
pub location_history_present: bool,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The chat / generation services compute the three new fields lazily per
|
||||||
|
turn via `SELECT 1 FROM <table> LIMIT 1` (cheap; cached for the turn's
|
||||||
|
duration). Lazy because operators import data after launch and we don't
|
||||||
|
want to require a restart for the LLM to discover its new capabilities.
|
||||||
|
|
||||||
|
Per-tool gating:
|
||||||
|
|
||||||
|
| Tool | Existing gate | New gate |
|
||||||
|
|---|---|---|
|
||||||
|
| `describe_photo` | `has_vision` | unchanged |
|
||||||
|
| `get_personal_place_at` | `apollo_enabled` | unchanged |
|
||||||
|
| `get_calendar_events` | none | `calendar_present` |
|
||||||
|
| `get_location_history` | none | `location_history_present` |
|
||||||
|
| `search_rag` | none | `daily_summaries_present` |
|
||||||
|
|
||||||
|
All other tools always-on. (`get_sms_messages` and `search_messages`
|
||||||
|
fail informatively if SMS-API is unreachable; not worth a startup probe
|
||||||
|
since intermittent failures are the same shape.)
|
||||||
|
|
||||||
|
### F. Tool descriptions — convention
|
||||||
|
|
||||||
|
Every description follows:
|
||||||
|
|
||||||
|
1. One sentence: **what** + **when to call**.
|
||||||
|
2. Param semantics worth knowing (units, ranges, mode behavior,
|
||||||
|
precedence).
|
||||||
|
3. **Example invocation** for tools with multiple modes, optional bands,
|
||||||
|
or non-obvious parameter interactions.
|
||||||
|
4. Cross-references when relevant: `prefer X when both apply`.
|
||||||
|
|
||||||
|
Banned: all-caps section headers inside descriptions
|
||||||
|
(`"CONTENT search"`, `"TIME-BASED fetch"`); persona-prescriptive language
|
||||||
|
(`"you are a..."`); behavioral references to other tools by description
|
||||||
|
rather than name.
|
||||||
|
|
||||||
|
Tools getting examples: `search_messages`, `search_rag`, `store_fact`,
|
||||||
|
`get_sms_messages`. Trivial tools (`get_current_datetime`,
|
||||||
|
`reverse_geocode`, `get_file_tags`) skip the example.
|
||||||
|
|
||||||
|
Sample (`search_messages`):
|
||||||
|
|
||||||
|
> Search SMS/MMS message bodies. Modes: `fts5` (keyword + phrase + prefix
|
||||||
|
> + AND/OR/NOT + NEAR proximity), `semantic` (embedding similarity,
|
||||||
|
> requires generated embeddings), `hybrid` (RRF merge, recommended;
|
||||||
|
> degrades to `fts5` when embeddings absent). Optional `start_ts` /
|
||||||
|
> `end_ts` (real-UTC unix seconds) and `contact_id` filters. For pure
|
||||||
|
> date / contact browsing without keywords, prefer `get_sms_messages`.
|
||||||
|
>
|
||||||
|
> Examples:
|
||||||
|
> - `{query: "trader joe's"}` — phrase across all time.
|
||||||
|
> - `{query: "dinner", contact_id: 42, start_ts: 1700000000, end_ts: 1700604800}`
|
||||||
|
> — keyword within a contact and a week.
|
||||||
|
> - `{query: "NEAR(meeting work, 5)"}` — proximity search.
|
||||||
|
|
||||||
|
### G. SMS tool fixes
|
||||||
|
|
||||||
|
#### `get_sms_messages` — honor `days_radius`
|
||||||
|
|
||||||
|
Today: `sms_client::fetch_messages_for_contact(contact, center_ts)`
|
||||||
|
hardcodes `Duration::days(4)` (lines 31–37). The tool accepts
|
||||||
|
`days_radius` and silently ignores it.
|
||||||
|
|
||||||
|
**Fix:** widen the signature to
|
||||||
|
`fetch_messages_for_contact(contact, center_ts, days_radius)`. Tool
|
||||||
|
plumbs through. Default 4 retained for back-compat.
|
||||||
|
|
||||||
|
#### `search_messages` — add date and contact_id filters
|
||||||
|
|
||||||
|
Today: ImageApi's `search_messages` only forwards `query`, `mode`,
|
||||||
|
`limit` to SMS-API.
|
||||||
|
|
||||||
|
**Fix:** add `start_ts`, `end_ts`, `contact_id` parameters.
|
||||||
|
- `contact_id` forwards directly to SMS-API
|
||||||
|
(`/api/messages/search/?contact_id=`).
|
||||||
|
- `start_ts` / `end_ts` are not natively accepted by SMS-API's search
|
||||||
|
endpoint. Apply client-side post-filter on the response (Apollo's
|
||||||
|
pattern: `chat_tools.py:670–680`). Bump the SMS-API `limit` to a
|
||||||
|
larger fetch pool when a date filter is supplied so in-window matches
|
||||||
|
aren't lost to out-of-window FTS rank.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation sequencing
|
||||||
|
|
||||||
|
Each step is independently mergeable.
|
||||||
|
|
||||||
|
### ImageApi PRs
|
||||||
|
|
||||||
|
1. **Split system-prompt assembly + neutralize user message.** Two
|
||||||
|
named blocks; user message context-only. Default identity string
|
||||||
|
added. Tests: golden snapshots of the resulting `system_content`
|
||||||
|
with and without `custom_system_prompt`.
|
||||||
|
2. **`system_prompt` field on chat request + swap/restore + amend
|
||||||
|
persistence.** Mirrors `annotate_system_with_budget` pattern. Tests:
|
||||||
|
round-trip system content unchanged in append mode; persisted in
|
||||||
|
amend mode.
|
||||||
|
3. **`fetch_messages_for_contact` honors `days_radius`.** Tool wires
|
||||||
|
the param through. Tests: window math at the client level.
|
||||||
|
4. **`ToolGateOpts` + per-tool description rewrites.** Description
|
||||||
|
text changes are the bulk of the diff but no behavior change beyond
|
||||||
|
gating.
|
||||||
|
|
||||||
|
### FileViewer-React PR
|
||||||
|
|
||||||
|
5. **Chat hook sends `system_prompt`; modal gets style-note input;
|
||||||
|
built-in personas updated to specify shape.** The
|
||||||
|
`useInsightChat.sendTurn` call site picks up the persona and
|
||||||
|
includes it on every chat turn body. Style-note input is a one-shot
|
||||||
|
suffix.
|
||||||
|
|
||||||
|
## Testing & verification
|
||||||
|
|
||||||
|
**Automated:**
|
||||||
|
- Unit (Rust): swap-and-restore round-trip preserves stored transcript.
|
||||||
|
- Unit (Rust): amend mode persists override into new insight row.
|
||||||
|
- Unit (Rust): `fetch_messages_for_contact(days_radius=N)` produces a
|
||||||
|
window of `2N` days centered on `center_ts`.
|
||||||
|
- Unit (Rust): `build_tool_definitions(opts)` excludes gated tools when
|
||||||
|
the corresponding flag is false.
|
||||||
|
|
||||||
|
**Manual:**
|
||||||
|
- Run a chat turn against an existing insight without `system_prompt` →
|
||||||
|
output unchanged from baseline.
|
||||||
|
- Same insight, with override → output reflects new voice.
|
||||||
|
- Re-open chat → original baked persona still authoritative (override
|
||||||
|
was ephemeral).
|
||||||
|
- Regenerate an insight with the journal persona → model's voice
|
||||||
|
matches journal style; no "memory assistant" framing leaks through.
|
||||||
|
- Toggle data presence (delete a row from `calendar_events`) → tool
|
||||||
|
drops from the catalog on the next turn.
|
||||||
|
|
||||||
|
## Risks
|
||||||
|
|
||||||
|
- **Default identity wording matters.** A too-neutral default ("Use the
|
||||||
|
gathered context to write a summary") might produce flatter output
|
||||||
|
than today's "personal photo memory assistant" framing for users
|
||||||
|
who never set a persona. Mitigation: tune the default with a small
|
||||||
|
set of test photos before merging.
|
||||||
|
- **Persona-suffix style notes can contradict persona voice.** A user
|
||||||
|
who picks `journal` (first person, warm) and adds the style note
|
||||||
|
"respond in bullet points" will get a tonal collision. Acceptable —
|
||||||
|
the user expressed a per-turn intent and we honor it. Document the
|
||||||
|
composition rule in the persona-manager UI.
|
||||||
|
- **Lazy data-presence probes add a per-turn `SELECT 1`.** Negligible
|
||||||
|
on SQLite (sub-millisecond) but adds up across many turns. Cache the
|
||||||
|
result for the turn's duration; don't re-probe per-tool.
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
None blocking. Items deferred to a possible follow-up cycle:
|
||||||
|
|
||||||
|
- Apollo parity for the same per-turn override pattern (already
|
||||||
|
present; just needs RN client wiring on the photo path which is
|
||||||
|
already proxy).
|
||||||
|
- Tool consolidation (`search_messages` + `get_sms_messages` →
|
||||||
|
single `search_messages` with optional date filter, Apollo-style).
|
||||||
|
Considered and deferred — separate spec.
|
||||||
Reference in New Issue
Block a user