When a photo exists in more than one library and the user
regenerates its insight from library A's chat, the regenerate
streams cleanly, store_insight flips library A's old row to
is_current=false, and inserts a new is_current=true row tagged
(library A, rel_path). On the next history fetch the user sees
their old transcript — the regenerate appears to vanish.
The cause: get_insight(file_path) filters on rel_path + is_current
only, so library B's untouched is_current=true row for the same
rel_path satisfies the query and gets returned by SQLite's .first()
ahead of A's new row. Because get_insight is also what
chat_turn_stream uses to decide bootstrap vs. continuation, the
next chat turn after the shadow hit also routes against the
wrong insight, so update_training_messages corrupts library B's
transcript with library A's chat.
Fix: add get_current_insight_for_library(library_id, file_path)
filtered on (library_id, rel_path, is_current=true) and route the
chat surface (load_history, chat_turn{,_stream}, rewind_history)
through it. load_history falls back to the cross-library
get_insight when the scoped lookup misses — preserves the
"scalar data merges across libraries" intent for the case where
the active library has no insight but another does. The path-only
get_insight stays for callers that don't have library context
(populate_knowledge, the photo-grid metadata fetch).
chat_history_handler stops dropping the parsed library on the
floor and threads it through. Single-library deploys see no
behaviour change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two persona-infrastructure correctness fixes that go together because
the second one (FK with CASCADE) requires the first (preventing the
persona row from being mutated out from under its facts).
1. update_persona handler refuses name/systemPrompt edits to built-ins
(409). includeAllMemories stays editable — that's a per-user
preference, not the persona's identity. Mirrors the existing
delete_persona guard. The DAO is intentionally permissive so the
guard sits at the HTTP layer; persona_dao test pins that contract.
2. Migration 2026-05-10 adds user_id to entity_facts and a composite
FK (user_id, persona_id) -> personas(user_id, persona_id) ON DELETE
CASCADE. This closes two issues at once:
- Persona orphans: deleting a custom persona used to leave its
facts dangling forever, readable only via PersonaFilter::All.
CASCADE now wipes them with the persona row.
- Multi-user fact leakage: PersonaFilter::Single("default") used
to surface every user's default-scoped facts. PersonaFilter is
now { user_id, persona_id } and all read paths
(get_facts_for_entity, list_facts, get_recent_activity) filter
on user_id first. upsert_fact's dedup key extends to user_id so
identical claims under shared persona names from different
users no longer corroborate-bump each other's confidence.
- user_id threads from Claims.sub.parse::<i32>().unwrap_or(1) at
the chat / insight handlers through ChatTurnRequest, the
streaming agentic loop, execute_tool, and into the leaf tools
(tool_store_fact, tool_recall_facts_for_photo). The ".unwrap_or(1)"
accommodates Apollo's service token whose sub is non-numeric on
legacy mints.
- Backfill picks the smallest user_id matching each legacy fact's
persona_id so the FK holds for already-stored rows.
Five new knowledge_dao tests with FK-on connection: persona scoping
isolation, All-variant union per-user, dedup not crossing users,
CASCADE delete, FK rejection of unknown personas. Plus
dao_update_does_not_block_built_ins documenting where the
HTTP-layer guard lives.
Apollo coordinates separately — the matching changes there add the
/api/personas proxy and start sending persona_id on photo-chat turns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move personas off the mobile client into ImageApi as first-class
records, and scope entity_facts by persona so each one builds its own
voice over a shared entity graph. The new include_all_memories flag
lets a persona opt back into the full hive-mind pool for human
browsing of /knowledge/*; agentic generation always stays in-voice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tap-Discuss-on-no-insight previously failed silently: ImageApi's
/insights/chat/stream required an existing agentic insight, errored
when missing, and emitted the failure as `event: error` — which the
frontend SSE consumer ignored (it listens for `error_message`).
This commit closes both gaps with a server-side state machine:
- /insights/chat/stream now branches on insight presence. Missing
insight (or `regenerate: true` in the body) → bootstrap path:
builds [System(req.system_prompt), User(req.user_message + image)],
runs the agentic loop, generates a title, persists a new row via
store_insight (which auto-flips priors). Existing insight →
continuation path (unchanged behaviour).
- New `regenerate: bool` request field forces bootstrap even when an
insight exists. Takes precedence over `amend`.
- `done` SSE payload field-name alignment with Apollo's frontend
convention: prompt_eval_count → prompt_tokens, eval_count →
eval_tokens, num_ctx echo added.
- `amended_insight_id` semantics broaden — now populated whenever the
turn produced a new row (bootstrap, regenerate, or amend). Existing
amend clients keep working unchanged; new clients get the new row's
id for free.
- `event: error` → `event: error_message` so frontend errors stop
silently dropping.
Refactor: extracted run_streaming_agentic_loop, build_chat_clients,
and generate_title as shared helpers between bootstrap and
continuation. Continuation path's outer logic moves to
run_continuation_streaming with no behaviour change.
Mobile-ready: any client (Apollo backend, mobile, future) sends one
request to /insights/chat/stream and gets the right path. Apollo's
proxy stays a dumb pipe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Append mode: applied ephemerally — original system message restored
before persistence so re-opens see the baked persona. Amend mode:
override stays in place and becomes the new insight row's system
message. Pattern mirrors annotate_system_with_budget.
Adds system_prompt field on both ChatTurnHttpRequest and ChatTurnRequest;
plumbs through chat_turn and chat_turn_stream identically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Few-shot injection on /insights/generate/agentic: compresses prior
training_messages into trajectory blocks (tool calls + result summaries)
and injects into the system prompt. Hardcoded default ids with optional
request override.
- New fewshot_source_ids column on photo_insights (+ migration) to track
which exemplars influenced a given row, for downstream training-set
filtering. Chat amend rows stamp None with a lineage note.
- Ollama client now remembers which server (primary/fallback) most
recently succeeded and tries it first on the next call, via a shared
Arc<AtomicBool>. Avoids re-404ing the primary on every agent iteration
when the chosen model only lives on the fallback.
- Demote noisy logs: daily_summary "Summary match" lines to debug;
inner chat_with_tools non-2xx body log from error to warn (outer
layer owns the terminal-error signal).
- Drift-guard tests for summarize_tool_result covering the success /
empty / error / unknown shape for every tool.
- Tidy: three pre-existing clippy warnings cleaned up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.
- Ollama: parses NDJSON from /api/chat stream, accumulates content
deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
through an mpsc channel, persists training_messages at the end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
load_history now groups preceding tool_call + tool_result scaffolding
under each assistant reply as `tools: [{name, arguments, result}]`.
Result bodies over 2000 chars are truncated for payload size with a
`result_truncated` flag; the full value remains in training_messages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.
Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.
Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).
Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a `backend` column to photo_insights (default 'local', migration
2026-04-20-000000) and a corresponding optional `backend` field on the
agentic request. When a request sets backend=hybrid:
- The local Ollama vision model is called once via describe_image to
produce a text description.
- The description is inlined into the first user message as text —
no base64 image is ever sent to the chat model.
- The agentic tool-calling loop and title generation route through an
OpenRouterClient (dispatched via &dyn LlmClient), letting the user
pick any tool-capable model from OpenRouter per request.
- describe_photo is removed from the offered tools since the description
is already present.
Embeddings and vision stay on local Ollama regardless of backend.
Hybrid mode requires OPENROUTER_API_KEY; handlers return a clear error
when hybrid is requested without it, and also when the selected
OpenRouter model lacks tool-calling support.
AppState gains an optional openrouter client built from
OPENROUTER_API_KEY / OPENROUTER_BASE_URL / OPENROUTER_DEFAULT_MODEL /
OPENROUTER_EMBEDDING_MODEL / attribution headers. Default model is
anthropic/claude-sonnet-4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Silence forward-looking dead_code on unused DAO modules, annotate
individual placeholder items, rewrite tautological assert!(true/false)
in token tests as panic! arms, and pick up fmt drift.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tags and insights now follow content across libraries via content_hash
lookups on the read path, so the same file indexed at different rel_paths
in multiple libraries shares its annotations. Recursive tag search scopes
hits to the selected library by checking each tagged rel_path against
the library's disk (with a content-hash sibling fallback so tags attached
under one library's rel_path still match a content-equivalent file in
another). The /image and /image/metadata handlers fall back across
libraries when the file isn't under the resolved one, so union-mode
search results (which carry no library attribution in the response)
still serve correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Expose Ollama sampling params through the insight generation endpoints
so users can tune creativity/determinism per request. All four are
optional — omitted values fall through to the model's server-side
defaults.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Captures prompt_eval_count and eval_count from Ollama /api/chat responses
during the agentic loop and returns them in POST /insights/generate/agentic
so the frontend can display context window usage to the user.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Sanitise tool call arguments before re-sending in conversation history: non-object values (bool, string, null) that some models produce are normalised to {} to prevent Ollama 500s
- Map 'error parsing tool call' Ollama 500 to HTTP 400 with a descriptive message listing compatible models (llama3.1, llama3.2, qwen2.5, mistral-nemo)
- Add reverse_geocode tool backed by existing Nominatim helper; description hints model can chain it after get_location_history results
- Make get_sms_messages contact parameter optional (was required, forcing the model to guess); executor now passes None to fall back to all-contacts search
- Log tool result outcomes at warn level for errors/empty results, info for successes; log SMS API errors with full detail; log full request body on Ollama 500
- Strengthen system prompt to require 3-4 tool calls before final answer
- Try fallback server when checking model capabilities (primary-only check caused 500 for models only on fallback)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Register the agentic insight endpoint that validates tool-calling capability,
runs the agentic loop, and returns the stored PhotoInsightResponse. Returns 400
for unsupported models, 500 for other errors. Max iterations configurable via
AGENTIC_MAX_ITERATIONS env var (default 10).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>