Land the persistence model and HTTP surface for local face recognition.
Inference still lives in Apollo (Phase 1); this side adds the data home
plus every endpoint Apollo's UI and FileViewer-React will consume.
Schema (new migration 2026-04-29-000000_add_faces):
- persons: visual identities. Optional entity_id bridges to the
existing knowledge-graph entities table; auto-bridging is left to
the management UI (we don't muddy LLM provenance from face rows).
UNIQUE(name COLLATE NOCASE) so 'alice' / 'Alice' fold to one row.
- face_detections: keyed on content_hash (cross-library dedup), with
status='detected' carrying bbox + 512-d embedding BLOB, and
'no_faces' / 'failed' marker rows that tell Phase 3's file watcher
not to re-scan. Marker invariant enforced via CHECK; partial UNIQUE
on content_hash WHERE status='no_faces' guards against double-marks.
Schema regenerated with `diesel print-schema` against a clean migration
run; joinables added for face_detections → libraries / persons and
persons → entities.
face_client.rs (sibling of apollo_client.rs):
- reqwest multipart, 60 s timeout (CPU inference on a backlog can be
slow; bounded threadpool on Apollo serializes calls anyway).
- FaceDetectError::{Permanent, Transient, Disabled} — Phase 3 keys
its marker-row decision on this. 422 → mark failed, 5xx → defer.
- APOLLO_FACE_API_BASE_URL falls back to APOLLO_API_BASE_URL when
unset; both unset = is_enabled() false, callers no-op.
faces.rs (DAO + handlers):
- SqliteFaceDao implements the full FaceDao trait; person face counts
go through sql_query because diesel's BoxedSelectStatement +
group_by trips trait-resolver recursion.
- merge_persons re-points face rows in a transaction, copies notes
when target's are empty, deletes src.
- manual POST /image/faces resolves content_hash through image_exif,
crops the user-drawn bbox with 10% padding (detector wants context
around ears/jaw), POSTs the crop to face_client.embed for a real
ArcFace vector, then inserts source='manual'.
- Cluster-suggest (Phase 6) gets its data from
GET /faces/embeddings — base64-encoded paged BLOBs so Apollo's
DBSCAN can stream them without ImageApi pre-aggregating.
Endpoints registered alongside add_*_services in main.rs:
GET /faces/stats?library=
GET /faces/embeddings?library=&unassigned=&limit=&offset=
GET /image/faces?path=&library=
POST /image/faces (manual create via embed)
PATCH /image/faces/{id}
DELETE /image/faces/{id}
GET /persons?library=
POST /persons
GET /persons/{id}
PATCH /persons/{id}
DELETE /persons/{id}?cascade=set_null|delete (set_null default)
POST /persons/{id}/merge
GET /persons/{id}/faces?library=
The file-watch hook (Phase 3) and the rerun-on-one-photo handler
(Phase 6) live behind the FaceDao methods marked dead_code today —
they're called only when those phases land. Same shape for the trait
methods that aren't reached by Phase 2 routes.
Tests: 3 DAO unit tests cover person CRUD + case-insensitive uniqueness,
marker-row idempotency (mark_status is a no-op when any row exists),
and merge re-pointing faces.
Cargo.toml: reqwest gains the `multipart` feature.
cargo build / cargo test --lib / cargo fmt / cargo clippy --all-targets
all clean for the new code; the two pre-existing test_path_excluder
failures and the pre-existing sort_by clippy warnings are unrelated and
present on master.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New endpoint accepts {path, library, latitude, longitude} and shells
out to exiftool to write GPSLatitude/GPSLongitude (with N/S, E/W refs)
into the file's EXIF in place. After the write, the handler
re-extracts EXIF and updates the image_exif row so the DB stays in
sync — the response carries the updated metadata block in one
round-trip. Falls through to store_exif if the row is missing.
`exif::write_gps` is the small helper. `-overwrite_original` so no
.orig sidecar is left behind. Validates lat/lon range + supports_exif
before spawning exiftool. Format support matches the existing read
path (JPEG / TIFF / RAW / HEIF / PNG / WebP) — videos still need a
different writer and aren't covered.
Apollo's "+ PIN" carousel button (separate commit on the Apollo side)
calls this through /api/photos/exif/gps. Drive-by: cargo fmt one-line
collapse on apollo_client.rs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Optional integration with the sibling Apollo project's user-defined
Places (name + lat/lon + radius_m + description + category). When
APOLLO_API_BASE_URL is set, the per-photo location resolver folds the
most-specific containing Place into the LLM prompt's location string —
"Home (My house in Cambridge) — near Cambridge, MA" rather than the
city name alone. Smallest-radius wins; Apollo sorts server-side via
/api/places/contains, so the carousel badge in Apollo and the prompt
string here always agree.
Adds an agentic tool `get_personal_place_at(latitude, longitude)` that
the LLM can call during chat continuation. Tool description tells the
model the call returns the user's free-text notes, not just a name.
Deliberately narrow — no enumerate-all variant, lat/lon required.
Unset APOLLO_API_BASE_URL = legacy Nominatim-only path, tool is not
registered. 5 s timeout; all errors degrade silently.
Tests: 5 unit tests for compose_location_string (Apollo only, Nominatim
only, both, both-with-description, neither).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.
Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).
Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a `backend` column to photo_insights (default 'local', migration
2026-04-20-000000) and a corresponding optional `backend` field on the
agentic request. When a request sets backend=hybrid:
- The local Ollama vision model is called once via describe_image to
produce a text description.
- The description is inlined into the first user message as text —
no base64 image is ever sent to the chat model.
- The agentic tool-calling loop and title generation route through an
OpenRouterClient (dispatched via &dyn LlmClient), letting the user
pick any tool-capable model from OpenRouter per request.
- describe_photo is removed from the offered tools since the description
is already present.
Embeddings and vision stay on local Ollama regardless of backend.
Hybrid mode requires OPENROUTER_API_KEY; handlers return a clear error
when hybrid is requested without it, and also when the selected
OpenRouter model lacks tool-calling support.
AppState gains an optional openrouter client built from
OPENROUTER_API_KEY / OPENROUTER_BASE_URL / OPENROUTER_DEFAULT_MODEL /
OPENROUTER_EMBEDDING_MODEL / attribution headers. Default model is
anthropic/claude-sonnet-4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PreviewClipGenerator stripped a single base_path, so videos in a
non-primary library ended up with the absolute path as 'relative'.
On Windows, PathBuf::from(preview_clips_dir).join(absolute) replaces
with the absolute path, and .with_extension("mp4") on a .mp4 input
yields the input path — ffmpeg then errors out with 'cannot edit
existing files in place'.
The generator now holds Vec<Library> and strips whichever root
actually contains the video, with separator normalization to match
the rest of the code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /video/generate and /image/metadata handlers assumed files live under
the resolved library only, which broke when a mobile client passed no
library (union mode) but the file lived in a non-primary library. Both
now fall back to scanning every configured library for an existing file.
InsightGenerator held a single base_path, so vision-model loads and
filename-date fallbacks failed for non-primary libraries. It now takes
Vec<Library> and probes each root in resolve_full_path.
/image/metadata responses now carry library_id/library_name so the
mobile viewer can surface which library a file belongs to.
Thumbnail generation at startup is now spawned on a background thread
so the HTTP server can accept traffic while large libraries backfill.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a `libraries` registry table and threads library_id through
per-instance metadata tables (image_exif, photo_insights,
entity_photo_links, video_preview_clips). File-path columns renamed to
rel_path to make the relative-to-root semantics explicit. Adds
content_hash + size_bytes on image_exif to support future hash-keyed
thumbnail/HLS dedup. Tags and favorites stay library-agnostic so they
share across libraries by rel_path.
Behavior is unchanged: a single primary library (id=1) is seeded from
BASE_PATH on first boot; all handlers and DAOs route through it as a
transitional shim until the API gains a library query param.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements persistent cross-photo knowledge memory so the agentic
insight loop can learn and recall facts about people, places, and
events across the photo collection.
Changes:
- photo_insights: drop UNIQUE(file_path) + INSERT OR REPLACE, replace
with append-only rows + is_current flag for insight history retention
- New tables: entities, entity_facts, entity_photo_links with FK
constraints and confidence scoring
- KnowledgeDao trait + SqliteKnowledgeDao with upsert, merge, and
corroboration (confidence +0.1 on duplicate fact detection)
- Four new agent tools: recall_entities, recall_facts_for_photo,
store_entity, store_fact (with object_entity_id FK support)
- Cameron entity auto-seeded with stable ID injected into system prompt
- Pre-run photo link clearing + post-loop source_insight_id backfill
- Audit REST API: GET/PATCH/DELETE /knowledge/entities/{id},
POST /knowledge/entities/merge, GET/PATCH/DELETE /knowledge/facts/{id},
GET /knowledge/recent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- cargo fmt applied across all modified source files
- Collapse nested if let Some / if !is_empty into a single let-chain (clippy::collapsible_match)
- All other warnings are pre-existing dead-code lint on unused trait methods
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Threads SqliteTagDao through InsightGenerator and AppState (both default
and test_state). Adds Send+Sync bounds to TagDao trait with unsafe impls
for SqliteTagDao (always Mutex-protected) and TestTagDao (single-threaded).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend (Rust/Actix-web):
- Add video_preview_clips table and PreviewDao for tracking preview generation
- Add ffmpeg preview clip generator: 10 equally-spaced 1s segments at 480p with CUDA NVENC auto-detection
- Add PreviewClipGenerator actor with semaphore-limited concurrent processing
- Add GET /video/preview and POST /video/preview/status endpoints
- Extend file watcher to detect and queue previews for new videos
- Use relative paths consistently for DB storage (matching EXIF convention)
Frontend (React Native/Expo):
- Add VideoWall grid view with 2-3 column layout of looping preview clips
- Add VideoWallItem component with ActiveVideoPlayer sub-component for lifecycle management
- Add useVideoWall hook for batch status polling with 5s refresh
- Add navigation button in grid header (visible when videos exist)
- Use TextureView surface type to fix Android z-ordering issues
- Optimize memory: players only mount while visible via FlatList windowSize
- Configure ExoPlayer buffer options and caching for short clips
- Tap to toggle audio focus, long press to open in full viewer
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Updated InsightGenerator struct with calendar, location, and search DAOs
- Implemented hybrid context gathering methods:
* gather_calendar_context(): ±7 days with semantic ranking
* gather_location_context(): ±30 min with GPS proximity check
* gather_search_context(): ±30 days semantic search
- Added haversine_distance() utility for GPS calculations
- Updated generate_insight_for_photo_with_model() to use multi-source context
- Combined all context sources (SMS + Calendar + Location + Search) with equal weight
- Initialized new DAOs in AppState (both default and test implementations)
- All contexts are optional (graceful degradation if data missing)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>