Commit Graph

86 Commits

Author SHA1 Message Date
Cameron Cordes
a0ec1a5080 insight-chat: photo context belongs in system msg, not user turn
After refresh, the rendered transcript was showing two unwanted
artifacts in the initial user bubble:

  Photo file path: pics/DSC_5171.jpg
  please tell me about this photo and what was going on around it

  Please write your final answer now without calling any more tools.

Two distinct bugs:

1. Bootstrap was prepending `Photo file path: <path>` (and, in
   hybrid mode, the visual description block) into the user-turn
   content. The model needed it to call file_path-keyed tools, but
   the user could see it in their own bubble on replay.

2. The no-tools fallback ("Please write your final answer now…")
   was a synthetic user message we never stripped from history,
   so it persisted into training_messages, rendered as a second
   user bubble, AND wiped the prior tool-call accumulator inside
   load_history (user-turn handler clears pending_tools), which
   is why the tool invocations disappeared from the assistant
   bubble after refresh.

Fixes:

- New `build_bootstrap_system_message` helper composes the persona
  with a `--- PHOTO CONTEXT ---` block (path + optional visual
  description). Lives in the system message, not the user turn.
  The user's bubble shows only what they typed.
- Streaming agentic loop's no-tools fallback now records its
  insertion index and removes the synthetic user prompt from
  `messages` after the model responds. Final assistant content
  stays — it reads coherently on replay without the synthetic
  prompt above it. Applies to both bootstrap and continuation.

3 new tests cover the system-message composer (path-only, with
visual block, persona-trim). Total insight_chat unit tests: 27.
2026-05-08 11:07:03 -04:00
Cameron Cordes
24ecf2abd4 insight-chat: prepend Photo file path: <path> to bootstrap user turn
Bug: bootstrap user_content was just the user's typed message (plus
the hybrid visual description). Tools that take a file_path arg —
recall_facts_for_photo, get_file_tags, get_faces_in_photo — had no
way to learn the canonical path. Small models would invent
placeholders like "input_file_0.png" or call the tool with a name
guessed from a hidden multimodal input handle, neither of which
matched any real photo.

Fix: prepend a single-line "Photo file path: <normalized>\n\n" block
to user_content. Same shape generate_agentic_insight_for_photo
already uses for non-chat callers — kept the bootstrap minimal
(no date / GPS / tags pre-stuffing; the agentic loop can fetch
those via tools when needed).

Hybrid still injects the visual description block between the path
block and the user message; local mode just gets path + user text.
2026-05-08 10:59:35 -04:00
Cameron Cordes
a29ff406a1 insight-chat: extract bootstrap resolution helpers + unit-test them
resolve_bootstrap_system_prompt and resolve_bootstrap_backend run on
every bootstrap turn — they pick the persisted system prompt and the
chosen backend label. They were inline conditionals before; pulling
them out makes the rules testable without spinning up the full
streaming stack.

9 new tests cover:
- system prompt fallback to BOOTSTRAP_DEFAULT_SYSTEM_PROMPT for None,
  empty string, whitespace-only
- supplied non-empty prompts pass through verbatim, with interior
  newlines / spacing preserved (Apollo personas use multi-line tool
  listings)
- backend defaults to "local" for None / empty
- "local" / "hybrid" accepted case-insensitively with edge-trim
- unknown labels return a descriptive error

Total insight_chat tests: 24 (up from 15). No behaviour change.
2026-05-08 10:56:22 -04:00
Cameron Cordes
928efe49f9 insight-chat: bootstrap insight on first Discuss message + regenerate flag
Tap-Discuss-on-no-insight previously failed silently: ImageApi's
/insights/chat/stream required an existing agentic insight, errored
when missing, and emitted the failure as `event: error` — which the
frontend SSE consumer ignored (it listens for `error_message`).

This commit closes both gaps with a server-side state machine:

- /insights/chat/stream now branches on insight presence. Missing
  insight (or `regenerate: true` in the body) → bootstrap path:
  builds [System(req.system_prompt), User(req.user_message + image)],
  runs the agentic loop, generates a title, persists a new row via
  store_insight (which auto-flips priors). Existing insight →
  continuation path (unchanged behaviour).
- New `regenerate: bool` request field forces bootstrap even when an
  insight exists. Takes precedence over `amend`.
- `done` SSE payload field-name alignment with Apollo's frontend
  convention: prompt_eval_count → prompt_tokens, eval_count →
  eval_tokens, num_ctx echo added.
- `amended_insight_id` semantics broaden — now populated whenever the
  turn produced a new row (bootstrap, regenerate, or amend). Existing
  amend clients keep working unchanged; new clients get the new row's
  id for free.
- `event: error` → `event: error_message` so frontend errors stop
  silently dropping.

Refactor: extracted run_streaming_agentic_loop, build_chat_clients,
and generate_title as shared helpers between bootstrap and
continuation. Continuation path's outer logic moves to
run_continuation_streaming with no behaviour change.

Mobile-ready: any client (Apollo backend, mobile, future) sends one
request to /insights/chat/stream and gets the right path. Apollo's
proxy stays a dumb pipe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 10:41:50 -04:00
Cameron Cordes
8bd1a85070 insight-chat: cargo fmt sweep on the get_faces_in_photo additions
Single-line dao lock + reordered faces import. No logic changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:53:31 -04:00
Cameron Cordes
6f0c15d0c5 insight-chat: code-review polish on get_faces_in_photo
- Drop redundant `use anyhow::Context` inside has_any_faces (already
  imported at the module level).
- Drop dead `.unwrap_or("?")` on bound faces — the vec is filtered to
  is_some() so the fallback can never fire.
- Reorder the face_dao constructor param + initializer to match the
  struct declaration (between tag_dao and knowledge_dao). Update both
  state.rs call sites and populate_knowledge.rs to match.
- Hold face_dao lock once across the library-resolver loop instead of
  reacquiring per iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:48:22 -04:00
Cameron Cordes
b64a5bec28 insight-chat: add get_faces_in_photo agentic tool
The LLM had no path to see face_detections data — get_file_tags
returns user-applied tags, but a face that's been detected and bound
to a person via the embedding-cluster auto-bind path doesn't always
have a matching tag. The new tool joins face_detections with persons
by content_hash and returns bound names + bboxes, plus unidentified
faces (so smaller models can count people in the photo without
inferring from a visual description).

Gated on face_detections being non-empty via the same has_any_*
pattern as daily_summaries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:43:16 -04:00
Cameron Cordes
b42acbb3f3 fmt: cargo fmt sweep across drifted files
No behavior change — purely whitespace/line-break cleanup that had
accumulated since the last format run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:42:41 -04:00
Cameron Cordes
1cdc0f6eb9 insight-chat: drop the dead SmsApiClient::search_messages wrapper
The post-PR-4 delegation kept it as a convenience for callers that
don't filter by contact, but nothing actually uses it. Delete to clear
the dead_code warning. search_messages_with_contact remains as the
single entry point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:10:31 -04:00
Cameron Cordes
e539c083c9 insight-chat: code-review polish on the tool-gating PR
- search_messages now delegates to search_messages_with_contact(.., None)
  so the two methods share a single HTTP path. Drops the dead-code
  warning and the ~30-line duplication.
- DailySummaryDao gains has_any_summaries (LIMIT 1 existence probe)
  used by current_gate_opts; the SELECT COUNT(*) get_total_summary_count
  added in the prior commit is removed (it had no other caller).
- current_gate_opts doc comment corrected to describe what the probes
  actually do.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:07:57 -04:00
Cameron Cordes
f50d32667b insight-chat: ToolGateOpts + per-tool description rewrites
Tools whose backing tables are empty (calendar, location_history,
daily_summaries) drop out of the catalog so the LLM doesn't waste
iteration budget calling them only to receive "no results found".
Vision and apollo gates already existed; this generalizes the pattern.

search_messages gains start_ts/end_ts/contact_id filters (date filter
is a client-side post-filter; SMS-API only accepts contact_id natively
on the search endpoint).

Descriptions follow a consistent convention: one sentence (what +
when), param semantics, examples for tools with non-obvious param
choices. No more all-caps headers, no more identity-prescriptive
language inside descriptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:56:58 -04:00
Cameron Cordes
b02da0d0cc insight-chat: code-review polish on the days_radius fix
- Bind effective_radius once in fetch_messages_for_contact so the log
  output and window math share a single source of truth for the clamp.
- Clamp tool-supplied days_radius to [1, 30] at the tool boundary so a
  runaway LLM value can't produce a thousand-day window.
- Split the negative-input test into a real negative-input case
  alongside the zero-input case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:47:46 -04:00
Cameron Cordes
659e7bd973 insight-chat: get_sms_messages tool now honors days_radius
The agentic tool definition advertised a days_radius parameter but
sms_client::fetch_messages_for_contact was hardcoded to ±4 days,
silently ignoring whatever value the LLM chose. Plumb the parameter
through; default 4 retained at the tool level for back-compat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:42:42 -04:00
Cameron Cordes
428f24b0f8 insight-chat: code-review polish on the chat system_prompt override
- Trim the override input once via Option::map(str::trim).filter(...).
- Use matches!() in restore_system_prompt_override's Prepended arm so
  it reads consistently with the Replaced arm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:40:04 -04:00
Cameron Cordes
faa289882f insight-chat: per-turn system_prompt override on chat continuation
Append mode: applied ephemerally — original system message restored
before persistence so re-opens see the baked persona. Amend mode:
override stays in place and becomes the new insight row's system
message. Pattern mirrors annotate_system_with_budget.

Adds system_prompt field on both ChatTurnHttpRequest and ChatTurnRequest;
plumbs through chat_turn and chat_turn_stream identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:34:08 -04:00
Cameron Cordes
177187f6a2 insight-chat: code-review polish on the system-prompt split
- Use Option::map instead of manual match-on-Option (drops clippy::manual_map).
- Drop redundant `max_iterations = max_iterations` from the format! call.
- Use captured identifiers consistently in the user_content format!.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:27:59 -04:00
Cameron Cordes
8ae4099d46 insight-chat: split generation system prompt into identity + procedural blocks
The framework no longer asserts "you are a personal photo memory
assistant" alongside a user-supplied custom_system_prompt — the
persona is the authoritative identity. The procedural block (tool-use
guidance, iteration budget) stays identity-free.

The user message also stops asking for "a detailed insight with a
title and summary" since the title is regenerated post-hoc anyway and
the wording was constraining voice for no data-model benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:20:45 -04:00
Cameron Cordes
48cac8c285 multi-library: hash-keyed tagged_photo + photo_insights with reconciliation
Branch B of the multi-library data-model rollout. tagged_photo and
photo_insights now follow the bytes (content_hash), not the path,
matching the policy pinned in CLAUDE.md "Multi-library data model".
Branch A's availability probe and EXIF scoping land first; this
branch builds on top.

Migration (2026-05-01-000000_hash_keyed_derived_data)

  Adds nullable content_hash columns to tagged_photo and photo_insights,
  with partial indexes on the non-null subset to keep the index small
  during the transitional window. The migration backfills from
  image_exif:
    * tagged_photo joins on rel_path alone (no library_id available);
    * photo_insights joins on (library_id, rel_path), unambiguous.
  Rows whose image_exif hash isn't known yet stay null and the runtime
  reconciliation pass populates them as the hash backlog drains.

Insert-time population

  TagDao::tag_file looks up image_exif.content_hash by rel_path before
  inserting; the hash is written into the new column.
  InsightDao::store_insight does the same scoped to (library_id,
  rel_path). Caller-supplied hash on InsertPhotoInsight wins; otherwise
  the DAO does the lookup. Both paths fall back to None if the hash
  isn't known yet — reconciliation backfills.

Reconciliation (database/reconcile.rs)

  Three idempotent passes the watcher runs once per tick after the
  per-library backfill loop:
    1. tagged_photo NULL hashes → populate from image_exif by rel_path.
    2. photo_insights NULL hashes → populate by (library_id, rel_path).
    3. photo_insights scalar merge — when multiple is_current rows
       share a content_hash, keep the earliest generated_at as
       current; demote the rest. Demoted rows keep their data so
       /insights/history is unaffected; only the "current" pointer
       narrows to one per hash.

  No filesystem dependency, so reconcile doesn't need the availability
  gate; runs every tick. Logs once when something changed, debug
  otherwise.

  Tags are set-valued under the policy (union on read, already
  DISTINCT in queries), so there is no analogous tag-collapse pass —
  duplicate (tag_id, content_hash) rows across libraries are
  harmless.

Read paths are unchanged in this branch — lookup_tags_batch's
existing rel_path-via-hash-sibling expansion still produces the
correct merge. A follow-up can simplify reads to use the new column
directly for performance.

Tests: 217 pass (212 pre-existing + 5 new in reconcile covering
NULL-fill, hash-not-yet-known no-op, library scoping on insights,
earliest-wins collapse, idempotency).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:52:16 +00:00
Cameron Cordes
67abd8d8ff style: cargo fmt
Pre-existing whitespace drift in test bodies, normalized by rustfmt.
No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:16:34 +00:00
Cameron Cordes
db9dc63e5e sqlite: enable WAL + busy_timeout in connect(); 408/413/429 transient
The DB connection helper now sets `journal_mode=WAL`, `busy_timeout=5000`,
and `synchronous=NORMAL` on every connection. 13+ DAOs each open their
own connection through this helper and share one SQLite file — without
WAL, a writer's exclusive lock blocks readers and `load_persons` racing
the face-watch write storm errored instantly with "database is locked".
GPU face inference made this visible by speeding detect ~10× and
flooding the writer side. WAL persists in the file once set so the
debug binaries that bypass connect() inherit it automatically.

Also widen face_client.rs's classifier: 408 / 413 / 429 are now Transient
instead of Permanent. These are operator-fixable proxy/infra errors;
marking them Permanent poisons every affected photo with status='failed'
and requires manual SQL to recover. Specifically, Apollo's nginx
defaulted to a 1 MB body cap and silently rejected normal-size photos
before they reached the backend — the deferred-and-retry contract is
the right behavior for that class of fault.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:13:15 +00:00
Cameron Cordes
f77e44b34d faces: fix PathExcluder false-positive + cover face_client/crop in tests
PathExcluder was iterating every component of the absolute path,
including the system prefix. Two of the existing memories tests had
been failing on master because tempdir() lives under /tmp on Linux
and a pattern like "tmp" then matched the system /tmp component
rather than anything the user actually asked to exclude. Phase 3's
file-watch hook will use the same code to skip @eaDir / .thumbnails
under each library's BASE_PATH, so the bug would hide every photo
on a host whose BASE_PATH passes through a directory named the same
as a user pattern.

Fix: store base in PathExcluder and strip it before scanning
components. A path that lives outside base falls through to the
no-match branch (defensive — nothing legit hits that today).

Also extracted the face_client error classification into a pure
classify_error_response(status, body) so the marker-row contract
with Apollo (422 → Permanent / 'failed', 5xx → Transient / defer)
is unit-testable without spinning up an HTTP server.

New tests:
  memories::tests::test_path_excluder_*           — 2 previously
    failing tests now pass.
  ai::face_client::tests::classify_*              — 4 cases:
    422 decode_failed → Permanent, 503 cuda_oom → Transient
    (handles both string and {code:..} detail shapes), 5xx →
    Transient + other 4xx → Permanent, unparseable HTML body still
    classifies on status.
  faces::tests::crop_*                            — 3 cases:
    invalid bbox rejected, valid bbox round-trips through JPEG
    decode, corner crop with 10% padding clamps inside source.

cargo test --lib: 165 passed / 0 failed (was 156 / 2 failed).
cargo fmt and clippy on new code clean. The remaining
sort_by clippy warnings in pre-existing files (memories.rs,
files.rs, exif.rs) are unrelated and present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:09:44 +00:00
Cameron Cordes
860169032b faces: phase 2 — schema + manual face/person CRUD
Land the persistence model and HTTP surface for local face recognition.
Inference still lives in Apollo (Phase 1); this side adds the data home
plus every endpoint Apollo's UI and FileViewer-React will consume.

Schema (new migration 2026-04-29-000000_add_faces):
  - persons: visual identities. Optional entity_id bridges to the
    existing knowledge-graph entities table; auto-bridging is left to
    the management UI (we don't muddy LLM provenance from face rows).
    UNIQUE(name COLLATE NOCASE) so 'alice' / 'Alice' fold to one row.
  - face_detections: keyed on content_hash (cross-library dedup), with
    status='detected' carrying bbox + 512-d embedding BLOB, and
    'no_faces' / 'failed' marker rows that tell Phase 3's file watcher
    not to re-scan. Marker invariant enforced via CHECK; partial UNIQUE
    on content_hash WHERE status='no_faces' guards against double-marks.

Schema regenerated with `diesel print-schema` against a clean migration
run; joinables added for face_detections → libraries / persons and
persons → entities.

face_client.rs (sibling of apollo_client.rs):
  - reqwest multipart, 60 s timeout (CPU inference on a backlog can be
    slow; bounded threadpool on Apollo serializes calls anyway).
  - FaceDetectError::{Permanent, Transient, Disabled} — Phase 3 keys
    its marker-row decision on this. 422 → mark failed, 5xx → defer.
  - APOLLO_FACE_API_BASE_URL falls back to APOLLO_API_BASE_URL when
    unset; both unset = is_enabled() false, callers no-op.

faces.rs (DAO + handlers):
  - SqliteFaceDao implements the full FaceDao trait; person face counts
    go through sql_query because diesel's BoxedSelectStatement +
    group_by trips trait-resolver recursion.
  - merge_persons re-points face rows in a transaction, copies notes
    when target's are empty, deletes src.
  - manual POST /image/faces resolves content_hash through image_exif,
    crops the user-drawn bbox with 10% padding (detector wants context
    around ears/jaw), POSTs the crop to face_client.embed for a real
    ArcFace vector, then inserts source='manual'.
  - Cluster-suggest (Phase 6) gets its data from
    GET /faces/embeddings — base64-encoded paged BLOBs so Apollo's
    DBSCAN can stream them without ImageApi pre-aggregating.

Endpoints registered alongside add_*_services in main.rs:
  GET    /faces/stats?library=
  GET    /faces/embeddings?library=&unassigned=&limit=&offset=
  GET    /image/faces?path=&library=
  POST   /image/faces                        (manual create via embed)
  PATCH  /image/faces/{id}
  DELETE /image/faces/{id}
  GET    /persons?library=
  POST   /persons
  GET    /persons/{id}
  PATCH  /persons/{id}
  DELETE /persons/{id}?cascade=set_null|delete   (set_null default)
  POST   /persons/{id}/merge
  GET    /persons/{id}/faces?library=

The file-watch hook (Phase 3) and the rerun-on-one-photo handler
(Phase 6) live behind the FaceDao methods marked dead_code today —
they're called only when those phases land. Same shape for the trait
methods that aren't reached by Phase 2 routes.

Tests: 3 DAO unit tests cover person CRUD + case-insensitive uniqueness,
marker-row idempotency (mark_status is a no-op when any row exists),
and merge re-pointing faces.

Cargo.toml: reqwest gains the `multipart` feature.

cargo build / cargo test --lib / cargo fmt / cargo clippy --all-targets
all clean for the new code; the two pre-existing test_path_excluder
failures and the pre-existing sort_by clippy warnings are unrelated and
present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:03:42 +00:00
Cameron Cordes
57fb0bcd3c EXIF GPS write: POST /image/exif/gps via exiftool
New endpoint accepts {path, library, latitude, longitude} and shells
out to exiftool to write GPSLatitude/GPSLongitude (with N/S, E/W refs)
into the file's EXIF in place. After the write, the handler
re-extracts EXIF and updates the image_exif row so the DB stays in
sync — the response carries the updated metadata block in one
round-trip. Falls through to store_exif if the row is missing.

`exif::write_gps` is the small helper. `-overwrite_original` so no
.orig sidecar is left behind. Validates lat/lon range + supports_exif
before spawning exiftool. Format support matches the existing read
path (JPEG / TIFF / RAW / HEIF / PNG / WebP) — videos still need a
different writer and aren't covered.

Apollo's "+ PIN" carousel button (separate commit on the Apollo side)
calls this through /api/photos/exif/gps. Drive-by: cargo fmt one-line
collapse on apollo_client.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 22:25:40 +00:00
Cameron Cordes
4ae7be35e9 Apollo Places: enrich insights with personal place name + notes
Optional integration with the sibling Apollo project's user-defined
Places (name + lat/lon + radius_m + description + category). When
APOLLO_API_BASE_URL is set, the per-photo location resolver folds the
most-specific containing Place into the LLM prompt's location string —
"Home (My house in Cambridge) — near Cambridge, MA" rather than the
city name alone. Smallest-radius wins; Apollo sorts server-side via
/api/places/contains, so the carousel badge in Apollo and the prompt
string here always agree.

Adds an agentic tool `get_personal_place_at(latitude, longitude)` that
the LLM can call during chat continuation. Tool description tells the
model the call returns the user's free-text notes, not just a name.
Deliberately narrow — no enumerate-all variant, lat/lon required.

Unset APOLLO_API_BASE_URL = legacy Nominatim-only path, tool is not
registered. 5 s timeout; all errors degrade silently.

Tests: 5 unit tests for compose_location_string (Apollo only, Nominatim
only, both, both-with-description, neither).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:11:12 +00:00
Cameron
fa21b0d73d chore(ai): disable default few-shot insight ids
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:12:25 -04:00
Cameron
0e55a6b125 fix(ai): treat rewind at end of history as no-op success
The mobile client's regenerate-after-failure flow sends a discard index
equal to the server's rendered count (its optimistic user bubble for the
failed turn was never persisted). find_raw_cut treated this as out of
range, surfacing as "Chat rewind failed: discard_from_rendered_index out
of range" and blocking the retry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:12:17 -04:00
Cameron
0ebc2e9003 feat(ai): rerank timing + think:false + OpenRouter error detail
- search_rag reranker now logs wall-clock time around the ollama.generate
  call, the candidate count / top-N going in, and the final reordering.
  The "final indices" + swap-count line is info level so it's always
  visible; detailed before/after previews stay at debug for when you want
  to inspect reranker quality.
- New OllamaClient::generate_no_think convenience that sets Ollama's
  top-level think:false on the request, plumbed through try_generate via
  a new internal generate_with_options. Used only by the reranker today;
  avoids the chain-of-thought tax on reasoning models (Qwen3/VL,
  DeepSeek-R1 distills, GPT-OSS) when the task has nothing to reason
  about. Server-side no-op on non-reasoning models.
- OpenRouter chat_with_tools "missing choices[0]" error now includes the
  actual response body — extracts structured {error: {code, message}}
  when OpenRouter surfaces it (common for upstream-provider issues like
  rate limits and content moderation), otherwise falls back to a
  truncated raw-JSON view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:19:45 -04:00
Cameron
e5781325c6 fix(ai): render tool-call arguments as compact JSON in logs
Switch the "Agentic tool call" log from {:?} (Debug) to {} (Display) on
serde_json::Value. Display produces compact JSON — `{"date":"2023-08-15"}`
instead of `Object {"date": String("2023-08-15")}` — which is what the
model actually sent and what a human reading the log wants to see.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:25:53 -04:00
Cameron
f0ae9f95dc feat(ai): few-shot exemplars + sticky Ollama preference
- Few-shot injection on /insights/generate/agentic: compresses prior
  training_messages into trajectory blocks (tool calls + result summaries)
  and injects into the system prompt. Hardcoded default ids with optional
  request override.
- New fewshot_source_ids column on photo_insights (+ migration) to track
  which exemplars influenced a given row, for downstream training-set
  filtering. Chat amend rows stamp None with a lineage note.
- Ollama client now remembers which server (primary/fallback) most
  recently succeeded and tries it first on the next call, via a shared
  Arc<AtomicBool>. Avoids re-404ing the primary on every agent iteration
  when the chosen model only lives on the fallback.
- Demote noisy logs: daily_summary "Summary match" lines to debug;
  inner chat_with_tools non-2xx body log from error to warn (outer
  layer owns the terminal-error signal).
- Drift-guard tests for summarize_tool_result covering the success /
  empty / error / unknown shape for every tool.
- Tidy: three pre-existing clippy warnings cleaned up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:54:06 -04:00
Cameron
dc2a96162e fix(dates): prefer earliest of fs created/modified as fallback
On copied or restored files (e.g. a backup library), the OS stamps
created at copy time while modified is preserved from the source, so
the earlier of the two is a better proxy for when the content
originated. Adds utils::earliest_fs_time and threads it through the
three spots that fall back to filesystem dates: photos-list sort,
memories grouping, and insight-generation timestamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:20:12 -04:00
Cameron
d54419e779 style: cargo fmt drift
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:19:59 -04:00
Cameron
aa651d1c7b feat(ai): iteration budget in prompt + preserve photo-knowledge links
- Inject the max-iterations budget into the agentic system prompt for
  both insight generation and chat turns. Chat does this per-turn by
  appending a note to the replayed system message and restoring it
  before persistence so the note doesn't accumulate across turns.
- Stop deleting entity_photo_links at the start of agentic insight
  generation. The clear made recall_facts_for_photo always return
  empty, wasting a tool call and discarding knowledge from prior runs.
  Re-linking the same entity is already an INSERT OR IGNORE no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:28:48 -04:00
Cameron
6831f50993 feat(ai): USER_NAME env + shared summary prompt + test-bin knobs
Introduces USER_NAME (default "Me") as the single source for the message
sender label and the first-person persona across daily summaries, SMS
context, insight generation, and chat. Eliminates the "Me:" transcript /
"what I did" ambiguity that confused smaller models, and unhardcodes
"Cameron" from prompt text + the knowledge-graph owner entity. Set
USER_NAME=Cameron in .env to preserve the existing owner entity row
(keyed on UNIQUE(name, entity_type)) — otherwise the next run creates
a fresh owner entity and orphans the existing facts/photo-links.

Also:
- search_messages redirect: when the model calls it with date/contact
  but no query, return a hint pointing at get_sms_messages instead of
  a bare missing-parameter error (prevents same-turn retry loops)
- sharpen search_messages vs get_sms_messages tool descriptions so
  content-vs-time-based intent is unambiguous
- extract build_daily_summary_prompt (+ DAILY_SUMMARY_MESSAGE_LIMIT,
  DAILY_SUMMARY_SYSTEM_PROMPT) shared by daily_summary_job and
  test_daily_summary binary — prompt tweaks now land in both
- EMBEDDING_MODEL const; fixes both insert sites that stored
  "mxbai-embed-large:335m" while generate_embeddings actually runs
  "nomic-embed-text:v1.5"
- test_daily_summary: add --num-ctx / --temperature / --top-p /
  --top-k / --min-p flags wired into OllamaClient setters, and print
  the configured knobs at the top of each run
- OllamaClient::generate now logs prompt/gen token counts and tok/s
  via log_chat_metrics (symmetric with chat_with_tools)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:39:37 -04:00
Cameron
e4a3536f87 feat(ai): search_messages tool + RAG reranker
Adds a search_messages tool that hits the Django FTS5/semantic/hybrid
endpoint for keyword-quality text search over message bodies, and an
LLM-based reranker inside tool_search_rag (gated by SEARCH_RAG_RERANK,
default on). Reranker pulls ~3x candidates from the vector index, asks
the chat model to rank by relevance, and falls back to vector order on
parse failure.

The reranker shares the active chat turn's OllamaClient so num_ctx and
sampling match — otherwise Ollama unloads/reloads the model on every
rerank call. (Unverified end-to-end; caught by inspection, awaiting
e2e confirmation.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 10:56:03 -04:00
Cameron
079cd4c5b9 feat(ai): streaming chat endpoint with live tool events
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.

- Ollama: parses NDJSON from /api/chat stream, accumulates content
  deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
  argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
  through an mpsc channel, persists training_messages at the end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:57:41 -04:00
Cameron
c2bd3c08e1 feat(ai): surface tool invocations in chat history
load_history now groups preceding tool_call + tool_result scaffolding
under each assistant reply as `tools: [{name, arguments, result}]`.
Result bodies over 2000 chars are truncated for payload size with a
`result_truncated` flag; the full value remains in training_messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:03:53 -04:00
Cameron
65ab10e9a8 feat(ai): chat rewind + ollama metrics logging
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.

Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:16:32 -04:00
Cameron
0b9528f61e feat(ai): chat continuation for photo insights (server v1)
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.

Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).

Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:00:27 -04:00
Cameron
e2eefbd156 feat(ai): curated OpenRouter model picker for hybrid backend
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:36:19 -04:00
Cameron
3ac0cd62eb feat(ai): hybrid backend mode for agentic insights
Adds a `backend` column to photo_insights (default 'local', migration
2026-04-20-000000) and a corresponding optional `backend` field on the
agentic request. When a request sets backend=hybrid:

- The local Ollama vision model is called once via describe_image to
  produce a text description.
- The description is inlined into the first user message as text —
  no base64 image is ever sent to the chat model.
- The agentic tool-calling loop and title generation route through an
  OpenRouterClient (dispatched via &dyn LlmClient), letting the user
  pick any tool-capable model from OpenRouter per request.
- describe_photo is removed from the offered tools since the description
  is already present.

Embeddings and vision stay on local Ollama regardless of backend.
Hybrid mode requires OPENROUTER_API_KEY; handlers return a clear error
when hybrid is requested without it, and also when the selected
OpenRouter model lacks tool-calling support.

AppState gains an optional openrouter client built from
OPENROUTER_API_KEY / OPENROUTER_BASE_URL / OPENROUTER_DEFAULT_MODEL /
OPENROUTER_EMBEDDING_MODEL / attribution headers. Default model is
anthropic/claude-sonnet-4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:30:40 -04:00
Cameron
e799ba716c feat(ai): add OpenRouterClient implementing LlmClient
OpenAI-compatible client for OpenRouter. Translates canonical wire shapes at
the boundary: tool-call arguments stringify on send / parse on receive
(accepting both string and native-object forms); images rewritten from the
base64 images field into content-parts with image_url entries; role=tool
messages inherit tool_call_id from the preceding assistant's tool calls.

/models parsed into ModelCapabilities via supported_parameters (tool use)
and architecture.input_modalities (vision). 15-minute capabilities cache.
Bearer auth; HTTP-Referer / X-Title attribution headers optional.

Not wired into request routing yet — first consumer arrives with hybrid
backend mode. 11 unit tests cover the translation helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:18:29 -04:00
Cameron
0073409b3d refactor: introduce LlmClient trait (no-op)
Preparation for a second LLM backend (OpenRouter) and hybrid vision-local /
chat-remote mode. Shared wire types (ChatMessage, Tool, ToolCall, etc.) move
into a new src/ai/llm_client.rs and are re-exported from ollama.rs so
existing imports keep working. OllamaClient now implements LlmClient.

No behavior change; callers still hold the concrete OllamaClient. Caller
migration to Arc<dyn LlmClient> is deferred to the PR that wires hybrid
backend routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:11:05 -04:00
Cameron
bffe604527 Remove potentially confusing TZ from insight generator 2026-04-21 01:55:07 +00:00
Cameron
a35b45fd36 feat: expand insight tool result caps and render timestamps in local time
Doubled default row caps for search_rag/get_sms_messages/get_calendar_events/recall_entities and exposed an optional `limit` parameter on each so the agent can tune per call. Render all LLM-facing timestamps as server-local time with explicit offset so smaller models stop misreading UTC as wall-clock time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
c2ee3996be chore: apply cargo fmt + clippy cleanup across crate
Silence forward-looking dead_code on unused DAO modules, annotate
individual placeholder items, rewrite tautological assert!(true/false)
in token tests as panic! arms, and pick up fmt drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
a0f3bfab5f fix: validate gps-summary path against every library
The /photos/gps-summary handler validated the incoming path against
the primary library's root with new_file=false, which requires the
path to exist on disk. For a viewer opened on a file from a
non-primary library, tapping the GPS link produced activePath =
<folder from lib 2>, the primary-only check failed, and the server
400'd — so the map came up empty.

Validation here is purely a traversal guard (the DAO does a prefix
LIKE against rel_path), so we now accept the path as long as any
configured library can resolve it without escaping its root.

Also applies cargo fmt drift on files touched this session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
e6ee38edec fix: resolve media across libraries for video, metadata, and insights
The /video/generate and /image/metadata handlers assumed files live under
the resolved library only, which broke when a mobile client passed no
library (union mode) but the file lived in a non-primary library. Both
now fall back to scanning every configured library for an existing file.

InsightGenerator held a single base_path, so vision-model loads and
filename-date fallbacks failed for non-primary libraries. It now takes
Vec<Library> and probes each root in resolve_full_path.

/image/metadata responses now carry library_id/library_name so the
mobile viewer can surface which library a file belongs to.

Thumbnail generation at startup is now spawned on a background thread
so the HTTP server can accept traffic while large libraries backfill.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
2d942a9926 feat: content-hash-aware tag/insight sharing + library scoping
Tags and insights now follow content across libraries via content_hash
lookups on the read path, so the same file indexed at different rel_paths
in multiple libraries shares its annotations. Recursive tag search scopes
hits to the selected library by checking each tagged rel_path against
the library's disk (with a content-hash sibling fallback so tags attached
under one library's rel_path still match a content-equivalent file in
another). The /image and /image/metadata handlers fall back across
libraries when the file isn't under the resolved one, so union-mode
search results (which carry no library attribution in the response)
still serve correctly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
ffcddbb843 feat: multi-library foundation (schema + libraries module)
Adds a `libraries` registry table and threads library_id through
per-instance metadata tables (image_exif, photo_insights,
entity_photo_links, video_preview_clips). File-path columns renamed to
rel_path to make the relative-to-root semantics explicit. Adds
content_hash + size_bytes on image_exif to support future hash-keyed
thumbnail/HLS dedup. Tags and favorites stay library-agnostic so they
share across libraries by rel_path.

Behavior is unchanged: a single primary library (id=1) is seeded from
BASE_PATH on first boot; all handlers and DAOs route through it as a
transitional shim until the API gains a library query param.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
8bc948b297 Insight prompt tweaks 2026-04-17 11:55:33 -04:00