Commit Graph

19 Commits

Author SHA1 Message Date
Cameron Cordes
860169032b faces: phase 2 — schema + manual face/person CRUD
Land the persistence model and HTTP surface for local face recognition.
Inference still lives in Apollo (Phase 1); this side adds the data home
plus every endpoint Apollo's UI and FileViewer-React will consume.

Schema (new migration 2026-04-29-000000_add_faces):
  - persons: visual identities. Optional entity_id bridges to the
    existing knowledge-graph entities table; auto-bridging is left to
    the management UI (we don't muddy LLM provenance from face rows).
    UNIQUE(name COLLATE NOCASE) so 'alice' / 'Alice' fold to one row.
  - face_detections: keyed on content_hash (cross-library dedup), with
    status='detected' carrying bbox + 512-d embedding BLOB, and
    'no_faces' / 'failed' marker rows that tell Phase 3's file watcher
    not to re-scan. Marker invariant enforced via CHECK; partial UNIQUE
    on content_hash WHERE status='no_faces' guards against double-marks.

Schema regenerated with `diesel print-schema` against a clean migration
run; joinables added for face_detections → libraries / persons and
persons → entities.

face_client.rs (sibling of apollo_client.rs):
  - reqwest multipart, 60 s timeout (CPU inference on a backlog can be
    slow; bounded threadpool on Apollo serializes calls anyway).
  - FaceDetectError::{Permanent, Transient, Disabled} — Phase 3 keys
    its marker-row decision on this. 422 → mark failed, 5xx → defer.
  - APOLLO_FACE_API_BASE_URL falls back to APOLLO_API_BASE_URL when
    unset; both unset = is_enabled() false, callers no-op.

faces.rs (DAO + handlers):
  - SqliteFaceDao implements the full FaceDao trait; person face counts
    go through sql_query because diesel's BoxedSelectStatement +
    group_by trips trait-resolver recursion.
  - merge_persons re-points face rows in a transaction, copies notes
    when target's are empty, deletes src.
  - manual POST /image/faces resolves content_hash through image_exif,
    crops the user-drawn bbox with 10% padding (detector wants context
    around ears/jaw), POSTs the crop to face_client.embed for a real
    ArcFace vector, then inserts source='manual'.
  - Cluster-suggest (Phase 6) gets its data from
    GET /faces/embeddings — base64-encoded paged BLOBs so Apollo's
    DBSCAN can stream them without ImageApi pre-aggregating.

Endpoints registered alongside add_*_services in main.rs:
  GET    /faces/stats?library=
  GET    /faces/embeddings?library=&unassigned=&limit=&offset=
  GET    /image/faces?path=&library=
  POST   /image/faces                        (manual create via embed)
  PATCH  /image/faces/{id}
  DELETE /image/faces/{id}
  GET    /persons?library=
  POST   /persons
  GET    /persons/{id}
  PATCH  /persons/{id}
  DELETE /persons/{id}?cascade=set_null|delete   (set_null default)
  POST   /persons/{id}/merge
  GET    /persons/{id}/faces?library=

The file-watch hook (Phase 3) and the rerun-on-one-photo handler
(Phase 6) live behind the FaceDao methods marked dead_code today —
they're called only when those phases land. Same shape for the trait
methods that aren't reached by Phase 2 routes.

Tests: 3 DAO unit tests cover person CRUD + case-insensitive uniqueness,
marker-row idempotency (mark_status is a no-op when any row exists),
and merge re-pointing faces.

Cargo.toml: reqwest gains the `multipart` feature.

cargo build / cargo test --lib / cargo fmt / cargo clippy --all-targets
all clean for the new code; the two pre-existing test_path_excluder
failures and the pre-existing sort_by clippy warnings are unrelated and
present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:03:42 +00:00
Cameron Cordes
4ae7be35e9 Apollo Places: enrich insights with personal place name + notes
Optional integration with the sibling Apollo project's user-defined
Places (name + lat/lon + radius_m + description + category). When
APOLLO_API_BASE_URL is set, the per-photo location resolver folds the
most-specific containing Place into the LLM prompt's location string —
"Home (My house in Cambridge) — near Cambridge, MA" rather than the
city name alone. Smallest-radius wins; Apollo sorts server-side via
/api/places/contains, so the carousel badge in Apollo and the prompt
string here always agree.

Adds an agentic tool `get_personal_place_at(latitude, longitude)` that
the LLM can call during chat continuation. Tool description tells the
model the call returns the user's free-text notes, not just a name.
Deliberately narrow — no enumerate-all variant, lat/lon required.

Unset APOLLO_API_BASE_URL = legacy Nominatim-only path, tool is not
registered. 5 s timeout; all errors degrade silently.

Tests: 5 unit tests for compose_location_string (Apollo only, Nominatim
only, both, both-with-description, neither).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:11:12 +00:00
Cameron
6831f50993 feat(ai): USER_NAME env + shared summary prompt + test-bin knobs
Introduces USER_NAME (default "Me") as the single source for the message
sender label and the first-person persona across daily summaries, SMS
context, insight generation, and chat. Eliminates the "Me:" transcript /
"what I did" ambiguity that confused smaller models, and unhardcodes
"Cameron" from prompt text + the knowledge-graph owner entity. Set
USER_NAME=Cameron in .env to preserve the existing owner entity row
(keyed on UNIQUE(name, entity_type)) — otherwise the next run creates
a fresh owner entity and orphans the existing facts/photo-links.

Also:
- search_messages redirect: when the model calls it with date/contact
  but no query, return a hint pointing at get_sms_messages instead of
  a bare missing-parameter error (prevents same-turn retry loops)
- sharpen search_messages vs get_sms_messages tool descriptions so
  content-vs-time-based intent is unambiguous
- extract build_daily_summary_prompt (+ DAILY_SUMMARY_MESSAGE_LIMIT,
  DAILY_SUMMARY_SYSTEM_PROMPT) shared by daily_summary_job and
  test_daily_summary binary — prompt tweaks now land in both
- EMBEDDING_MODEL const; fixes both insert sites that stored
  "mxbai-embed-large:335m" while generate_embeddings actually runs
  "nomic-embed-text:v1.5"
- test_daily_summary: add --num-ctx / --temperature / --top-p /
  --top-k / --min-p flags wired into OllamaClient setters, and print
  the configured knobs at the top of each run
- OllamaClient::generate now logs prompt/gen token counts and tok/s
  via log_chat_metrics (symmetric with chat_with_tools)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:39:37 -04:00
Cameron
079cd4c5b9 feat(ai): streaming chat endpoint with live tool events
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.

- Ollama: parses NDJSON from /api/chat stream, accumulates content
  deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
  argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
  through an mpsc channel, persists training_messages at the end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:57:41 -04:00
Cameron
65ab10e9a8 feat(ai): chat rewind + ollama metrics logging
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.

Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:16:32 -04:00
Cameron
0b9528f61e feat(ai): chat continuation for photo insights (server v1)
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.

Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).

Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:00:27 -04:00
Cameron
e2eefbd156 feat(ai): curated OpenRouter model picker for hybrid backend
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:36:19 -04:00
Cameron
e799ba716c feat(ai): add OpenRouterClient implementing LlmClient
OpenAI-compatible client for OpenRouter. Translates canonical wire shapes at
the boundary: tool-call arguments stringify on send / parse on receive
(accepting both string and native-object forms); images rewritten from the
base64 images field into content-parts with image_url entries; role=tool
messages inherit tool_call_id from the preceding assistant's tool calls.

/models parsed into ModelCapabilities via supported_parameters (tool use)
and architecture.input_modalities (vision). 15-minute capabilities cache.
Bearer auth; HTTP-Referer / X-Title attribution headers optional.

Not wired into request routing yet — first consumer arrives with hybrid
backend mode. 11 unit tests cover the translation helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:18:29 -04:00
Cameron
0073409b3d refactor: introduce LlmClient trait (no-op)
Preparation for a second LLM backend (OpenRouter) and hybrid vision-local /
chat-remote mode. Shared wire types (ChatMessage, Tool, ToolCall, etc.) move
into a new src/ai/llm_client.rs and are re-exported from ollama.rs so
existing imports keep working. OllamaClient now implements LlmClient.

No behavior change; callers still hold the concrete OllamaClient. Caller
migration to Arc<dyn LlmClient> is deferred to the PR that wires hybrid
backend routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:11:05 -04:00
Cameron
c703a47f17 Add the ability to rate insights to curate training data 2026-04-13 09:23:40 -04:00
Cameron
091327e5d9 feat: add POST /insights/generate/agentic handler and route
Register the agentic insight endpoint that validates tool-calling capability,
runs the agentic loop, and returns the stored PhotoInsightResponse. Returns 400
for unsupported models, 500 for other errors. Max iterations configurable via
AGENTIC_MAX_ITERATIONS env var (default 10).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:01:25 -04:00
Cameron
af35a996a3 Cleanup unused message embedding code
Fixup some warnings
2026-01-14 13:33:36 -05:00
Cameron
e9729e9956 Fix unused import from binary from getting removed with cargo fix 2026-01-14 13:13:06 -05:00
Cameron
ad0bba63b4 Add check for vision capabilities 2026-01-11 15:22:24 -05:00
Cameron
084994e0b5 Daily Summary Embedding Testing 2026-01-08 13:41:32 -05:00
Cameron
bb23e6bb25 Cargo fix 2026-01-05 10:31:34 -05:00
Cameron
11e725c443 Enhanced Insights with daily summary embeddings
Bump to 0.5.0. Added daily summary generation job
2026-01-05 09:13:16 -05:00
Cameron
cf52d4ab76 Add Insights Model Discovery and Fallback Handling 2026-01-03 20:27:34 -05:00
Cameron
1171f19845 Create Insight Generation Feature
Added integration with Messages API and Ollama
2026-01-03 10:30:37 -05:00