The mobile client's regenerate-after-failure flow sends a discard index
equal to the server's rendered count (its optimistic user bubble for the
failed turn was never persisted). find_raw_cut treated this as out of
range, surfacing as "Chat rewind failed: discard_from_rendered_index out
of range" and blocking the retry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Few-shot injection on /insights/generate/agentic: compresses prior
training_messages into trajectory blocks (tool calls + result summaries)
and injects into the system prompt. Hardcoded default ids with optional
request override.
- New fewshot_source_ids column on photo_insights (+ migration) to track
which exemplars influenced a given row, for downstream training-set
filtering. Chat amend rows stamp None with a lineage note.
- Ollama client now remembers which server (primary/fallback) most
recently succeeded and tries it first on the next call, via a shared
Arc<AtomicBool>. Avoids re-404ing the primary on every agent iteration
when the chosen model only lives on the fallback.
- Demote noisy logs: daily_summary "Summary match" lines to debug;
inner chat_with_tools non-2xx body log from error to warn (outer
layer owns the terminal-error signal).
- Drift-guard tests for summarize_tool_result covering the success /
empty / error / unknown shape for every tool.
- Tidy: three pre-existing clippy warnings cleaned up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Inject the max-iterations budget into the agentic system prompt for
both insight generation and chat turns. Chat does this per-turn by
appending a note to the replayed system message and restoring it
before persistence so the note doesn't accumulate across turns.
- Stop deleting entity_photo_links at the start of agentic insight
generation. The clear made recall_facts_for_photo always return
empty, wasting a tool call and discarding knowledge from prior runs.
Re-linking the same entity is already an INSERT OR IGNORE no-op.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.
- Ollama: parses NDJSON from /api/chat stream, accumulates content
deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
through an mpsc channel, persists training_messages at the end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
load_history now groups preceding tool_call + tool_result scaffolding
under each assistant reply as `tools: [{name, arguments, result}]`.
Result bodies over 2000 chars are truncated for payload size with a
`result_truncated` flag; the full value remains in training_messages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.
Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.
Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).
Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>