341 Commits

Author SHA1 Message Date
Cameron Cordes
57b7bad086 duplicates: library-aware visibility — only hide a demoted row when its survivor is reachable
Soft-marked rows used to disappear from /photos globally, including
from a library-scoped view that didn't contain the survivor at all.
A user browsing lib A who'd promoted a file from lib B as the
survivor would silently lose visibility on their own copy in lib A,
even though lib B's file isn't reachable from lib A's view.

Library-scoped queries now keep a demoted row visible when its
survivor lives in a library outside the current scope. Implemented
as a NOT EXISTS subquery against the same image_exif table aliased
as `survivor`. The unscoped (all-libraries) view is unchanged — every
survivor is reachable, so demoted rows stay hidden as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:24:07 -04:00
Cameron Cordes
98057c98a1 duplicates: tighten perceptual cluster — entropy band, asymmetric dHash, medoid prune
Three changes against "still too loose at lowest sensitivity":

- Popcount entropy band tightened from [8, 56] to [16, 48]. The wider
  band let too much low-frequency content through (skies, scans,
  faded film) where pHash collapses to near-uniform values that
  Hamming-trivially across hundreds of unrelated images.
- dHash check now uses an asymmetric stricter threshold
  (dhash_threshold = max(2, threshold/2)). pHash is the candidate-
  discovery signal; dHash is validation. Splitting the budget means
  a real near-dup survives both while incidental pHash collisions
  on uniform content get vetoed. Missing dHash on either side now
  rejects the edge (was: trust pHash alone).
- Single-link union-find can chain weakly-similar images via
  transitive edges. Added a medoid-validation pass: per cluster,
  pick the member with smallest summed distance to others, then
  drop any whose distance to it exceeds threshold. Two new tests
  pin both invariants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:19:48 -04:00
Cameron Cordes
7ca888e95d duplicates: filter low-entropy hashes + dHash double-check, fix backfill loop
The perceptual cluster was producing one giant first group that
contained hundreds of unrelated images. Two causes:
- Solid-colour images (skies, black frames, monochrome scans) all
  hash to near-zero pHashes that Hamming-distance-zero to each other.
- Single-link clustering on pHash alone is too permissive — a chain
  of weakly-similar images all collapses into one cluster.

Fixed by skipping hashes outside the popcount [8, 56] band (uniform
content) and requiring dHash agreement within threshold before
unioning a candidate edge from the BK-tree. Two new tests pin both
invariants.

Backfill bin separately fix: decode-failed rows kept phash_64=NULL
and got re-pulled by every batch, infinite-looping on a queue of
unbreakable formats. Persist a 0/0 sentinel on decode failure so
the row leaves the candidate set; the all-zero hash is excluded
from clustering by the same entropy filter so it doesn't pollute
results.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:08:05 -04:00
Cameron Cordes
7584cd8792 duplicates: perceptual hash + soft-mark resolution + upload 409
Adds pHash + dHash columns alongside the existing blake3 content_hash so
near-duplicates (re-encoded, resized, format-converted copies) become
queryable. /duplicates/{exact,perceptual} return groups; /duplicates/
{resolve,unresolve} flip a duplicate_of_hash soft-mark on losing rows
and union perceptual-only tag sets onto the survivor. The default
/photos listing filters duplicate_of_hash IS NULL so demoted siblings
stop cluttering the grid; include_duplicates=true opts back in for
Apollo's review modal. Upload now hashes bytes pre-write and returns
409 with the canonical sibling when a file's bytes already exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:36:01 -04:00
Cameron Cordes
fb4df4b195 style: cargo fmt sweep
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:01:00 -04:00
Cameron Cordes
1d9b9a0bc4 faces: avoid 40 MB row clone in /faces/embeddings
list_embeddings cloned the full FaceDetectionRow inside the filter_map
just to pair it with the base64-encoded embedding. The 2 KB BLOB was
already on the row — at 20k unassigned faces that's 40 MB of pointless
heap traffic per Apollo cluster-suggest run. Move the bytes out via
Option::take() so the row drops the BLOB instead of duplicating it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:00:55 -04:00
Cameron Cordes
814066551e multi-library: per-library excluded_dirs
Adds a nullable comma-separated TEXT column to the libraries table.
Effective excludes for a walk = (env-var globals) ∪
(library.excluded_dirs). Empty / NULL = no library-specific
extras; the global env var still applies.

Migration (2026-05-01-110000_libraries_excluded_dirs)

  ALTER TABLE libraries ADD COLUMN excluded_dirs TEXT. NULL on every
  existing row — no behavior change on upgrade.

Library struct + helpers (libraries.rs)

  - Library gains excluded_dirs: Vec<String>, parsed from the column
    by parse_excluded_dirs_column (drops empties / whitespace,
    matches the env-var parser).
  - Library::effective_excluded_dirs(globals) returns the union.
  - From<LibraryRow> hydrates the field on AppState construction so
    /libraries surfaces it.

Watcher / walkers / memories

  Every per-library walker now consults the effective set:
    - process_new_files (file-watch ingest, RAW/EXIF/face)
    - process_face_backlog (filter_excluded inherits)
    - create_thumbnails (startup + new-file branch)
    - update_media_counts (Prometheus gauge)
    - cleanup_orphaned_playlists (per-library source-existence check)
    - memories endpoint (PathExcluder)

  Effective set is computed once per per-library iteration in the
  watcher tick and threaded through; called functions retain their
  flat &[String] signature (no per-library awareness needed inside
  the walker primitives).

Use case: mount a parent directory while a sibling library covers
a child subtree, and exclude the child subtree from the parent so
the libraries don't double-walk / double-write image_exif. With
hash-keyed derived data (Branches B/C), the duplication-avoidance
is the only cost prevented — face / tag / insight sharing was
already correct via content_hash.

Tests: 228 pass (226 from previous + 2 new in libraries::tests:
parse_excluded_dirs_column edge cases,
effective_excluded_dirs_unions_global_and_per_library).

CLAUDE.md gains a "Per-library excludes" subsection of the
multi-library data model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:54:17 +00:00
Cameron Cordes
3598bb2cfe multi-library: operator kill switch via libraries.enabled
A small follow-up to Branches A/B/C. Adds a nullable-default-1
boolean column to the `libraries` table that controls whether the
watcher considers the library at all. Useful for staging a new
mount before committing to ingest, and as a maintenance kill
switch when a library needs to be quiet without being unmounted.

Migration (2026-05-01-100000_libraries_enabled_flag)

  ALTER TABLE libraries ADD COLUMN enabled BOOLEAN NOT NULL DEFAULT 1.
  Existing rows stay enabled — no behavior change on upgrade.

Watcher gate (main.rs)

  At the top of the per-library loop, if !lib.enabled { continue; }
  — runs BEFORE the availability probe. Disabled libraries don't
  enter the health map, don't get probed, don't get ingest, don't
  get any maintenance pass. The initial sweep before the loop's
  first sleep also skips disabled libraries.

Orphan-GC consensus (library_maintenance.rs)

  all_libraries_online filters disabled libraries out of the
  consensus check — they're treated as out-of-scope, not as
  blockers. Otherwise flipping enabled=false would permanently
  halt orphan GC for the rest of the system, which is the opposite
  of the intended kill-switch semantics.

Cross-library duplicates: safe by construction. Hash-keyed derived
data (face_detections, tagged_photo with hash, photo_insights with
hash) is anchored by ANY image_exif row carrying the hash. Disabling
a library does NOT delete its image_exif rows, so a hash referenced
by a disabled library's row stays anchored — derived data survives.
collect_orphan_hashes deliberately doesn't filter image_exif by
library.enabled for exactly this reason.

No HTTP endpoint. Library mutation is rare-enough infra work that a
SQL toggle is fine, and a public mutation endpoint without a role /
permission story would be poorly-prioritized exposure for a
single-user tool. Documented in CLAUDE.md.

Tests: 226 pass (225 from Branch C + 1 new
all_libraries_online_treats_disabled_as_out_of_scope, which proves
that even an explicit Stale entry on a disabled library doesn't
block the consensus).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:10:24 +00:00
Cameron Cordes
d809ddee44 library_maintenance: clarify orphan-gc log wording
"marked 2 new" parses as "2 new files" on first read — but the
unit is content_hashes, and the action is observing them as
orphaned (becoming-deleted, not appearing). Reword:

  "{} new orphan hash(es) marked, {} revived"

instead of "marked {} new, revived {}". Also pluralize the deleted
counts ("row(s)") and append the pending-set size to the success
log so a tick that both deletes and re-marks doesn't lose the
trailing-state context.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:01:01 +00:00
Cameron Cordes
fa98d147be library_maintenance: log orphan-gc decisions in stale-library path too
run_orphan_gc returned early on the !all_online branch before the
final debug/info log line, so the GC was effectively invisible
whenever any library was Stale — exactly the dry-run scenario where
operators most want to confirm the safety gate is firing. Add the
same conditional log inside the early-return branch (plus a
"deferred — at least one library Stale" hint in the info-level
variant when there's something newly marked).

No behavior change beyond observability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:14:09 +00:00
Cameron Cordes
263e27e108 multi-library: handoff + orphan GC with two-tick consensus
Branch C of the multi-library data-model rollout. Implements the
operational maintenance pipeline pinned in CLAUDE.md → "Multi-library
data model" / "Library availability and safety". Branches A and B
land first; this branch builds on top.

New module: src/library_maintenance.rs

Three idempotent passes the watcher runs every tick after the
per-library ingest loop:

1. Missing-file scan (per online library)

   For each Online library, load a paginated page of image_exif rows
   (IMAGE_EXIF_MISSING_SCAN_PAGE_SIZE, default 500), stat() each one,
   and delete rows whose source file is NotFound. Permission/IO
   errors are skipped, never deleted. Capped at
   IMAGE_EXIF_MISSING_DELETE_CAP_PER_TICK (default 200) per library
   per tick — so a pathological mount that returns NotFound for
   everything can't wipe the table in one cycle. Cursor advances
   across ticks, wraps on partial-page returns, and naturally cycles
   through the entire library over many minutes. Skipped wholesale
   for Stale libraries via the existing probe gate.

2. Back-ref refresh (DB-only)

   For face_detections / tagged_photo / photo_insights: any
   hash-keyed row whose (library_id, rel_path) no longer matches an
   image_exif row, but whose content_hash does, is repointed at a
   surviving image_exif location. Pure SQL with EXISTS guards so
   rows whose hash is fully orphaned are left alone (the orphan GC
   handles those). Idempotent; no availability gate needed.

   This is what makes a recent → archive move invisible to readers:
   when pass 1 retires the lib-A row, pass 2 pivots tags / faces /
   insights to lib-B's surviving path before any client notices.

3. Orphan GC (destructive)

   Hash-keyed derived rows whose content_hash has no image_exif
   referent are GC-eligible. Two-tick consensus: a hash must be
   observed orphaned on two consecutive ticks AND every library must
   be Online for both. A single Stale tick within the window cancels
   all pending deletes (they remain marked but won't be promoted) —
   they're re-evaluated next tick. The pending set lives in
   OrphanGcState (in-memory); a watcher restart resets it, which can
   only delay a delete, never cause one. Hashes that re-appear in
   image_exif between ticks are "revived" from the pending set
   (handles transient share unmount / remount).

Two new ExifDao methods:
  - list_rel_paths_for_library_page(library_id, limit, offset) for
    the paginated missing-file scan.
  - (count_for_library landed in Branch A.)

Watcher wiring (main.rs)

Per-library: missing-file scan inside the existing per-library
loop, after process_new_files, gated by the same probe check that
already protects ingest. After the loop: reconcile (Branch B),
back-ref refresh, then run_orphan_gc. The maintenance connection is
opened once per tick (image_api::database::connect), used by all
three DB-only passes, and dropped at end of tick.

CLAUDE.md gains a "Maintenance pipeline" subsection that describes
the three passes and their interaction with the existing
availability-and-safety policy.

Tests: 225 pass (217 from Branch B + 8 new in library_maintenance
covering back-ref refresh including the fully-orphaned no-op case,
two-tick GC consensus, Stale-tick consensus reset, image_exif
re-appearance revival, multi-table delete, and the
all_libraries_online helper).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:27:53 +00:00
Cameron Cordes
48cac8c285 multi-library: hash-keyed tagged_photo + photo_insights with reconciliation
Branch B of the multi-library data-model rollout. tagged_photo and
photo_insights now follow the bytes (content_hash), not the path,
matching the policy pinned in CLAUDE.md "Multi-library data model".
Branch A's availability probe and EXIF scoping land first; this
branch builds on top.

Migration (2026-05-01-000000_hash_keyed_derived_data)

  Adds nullable content_hash columns to tagged_photo and photo_insights,
  with partial indexes on the non-null subset to keep the index small
  during the transitional window. The migration backfills from
  image_exif:
    * tagged_photo joins on rel_path alone (no library_id available);
    * photo_insights joins on (library_id, rel_path), unambiguous.
  Rows whose image_exif hash isn't known yet stay null and the runtime
  reconciliation pass populates them as the hash backlog drains.

Insert-time population

  TagDao::tag_file looks up image_exif.content_hash by rel_path before
  inserting; the hash is written into the new column.
  InsightDao::store_insight does the same scoped to (library_id,
  rel_path). Caller-supplied hash on InsertPhotoInsight wins; otherwise
  the DAO does the lookup. Both paths fall back to None if the hash
  isn't known yet — reconciliation backfills.

Reconciliation (database/reconcile.rs)

  Three idempotent passes the watcher runs once per tick after the
  per-library backfill loop:
    1. tagged_photo NULL hashes → populate from image_exif by rel_path.
    2. photo_insights NULL hashes → populate by (library_id, rel_path).
    3. photo_insights scalar merge — when multiple is_current rows
       share a content_hash, keep the earliest generated_at as
       current; demote the rest. Demoted rows keep their data so
       /insights/history is unaffected; only the "current" pointer
       narrows to one per hash.

  No filesystem dependency, so reconcile doesn't need the availability
  gate; runs every tick. Logs once when something changed, debug
  otherwise.

  Tags are set-valued under the policy (union on read, already
  DISTINCT in queries), so there is no analogous tag-collapse pass —
  duplicate (tag_id, content_hash) rows across libraries are
  harmless.

Read paths are unchanged in this branch — lookup_tags_batch's
existing rel_path-via-hash-sibling expansion still produces the
correct merge. A follow-up can simplify reads to use the new column
directly for performance.

Tests: 217 pass (212 pre-existing + 5 new in reconcile covering
NULL-fill, hash-not-yet-known no-op, library scoping on insights,
earliest-wins collapse, idempotency).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:52:16 +00:00
Cameron Cordes
48ed7be5d9 libraries: initial availability sweep before watcher's first sleep
new_health_map seeds every library as Online, and the watcher's tick
loop sleeps WATCH_QUICK_INTERVAL_SECONDS (default 60s) before its
first probe — meaning /libraries reported the optimistic default for
up to a minute after boot, even when a share was clearly unmounted.

Run the same refresh_health pass once at the top of the watcher
thread before entering the sleep loop. /libraries is then truthful
within milliseconds of the watcher thread starting (effectively from
the first HTTP request, since the watcher spawns well before the
server binds).

The per-tick gate inside the loop is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:33:45 +00:00
Cameron Cordes
eea1bf3181 multi-library: availability probe + scoped EXIF queries + collision fixes
Branch A of the multi-library data-model rollout. Three threads of
correctness/safety work that ship together because the new mount
needs all three before it can land:

1. Library availability probe (libraries.rs, state.rs, main.rs)

   New LibraryHealth (Online | Stale { reason, since }) and a shared
   LibraryHealthMap on AppState. Probe checks root_path exists +
   is_dir + readable + non-empty (relative to a "had_data" signal so
   fresh mounts aren't downgraded). The watcher tick begins with a
   refresh_health() per library; stale libraries skip ingest, the
   hash backfill, and face-detection backlog drains for that tick.
   The orphaned-playlist cleanup also gates on every library being
   online — a missing source on a stale library is indistinguishable
   from a transient unmount, and the cleanup is destructive.

   /libraries now returns each library with its current health
   state. Logs only on Online↔Stale transitions so a long outage
   doesn't spam.

   New ExifDao::count_for_library is the "had_data" signal.

2. EXIF queries scoped by library_id (database/mod.rs, files.rs,
   main.rs, tags.rs)

   query_by_exif gains an Option<i32> library filter; /photos and
   /photos/exif now pass it. Without this, an EXIF-filtered request
   scoped to ?library=N returned cross-library results because the
   handler resolved the library but didn't push it through to SQL.

   get_exif_batch gains the same option. The watcher's per-library
   ingest, face-candidate build, and content-hash backfill all
   scope to their library; the union-mode /photos date-sort path
   and the library-agnostic tag fan-out (lookup_tags_batch, by
   design) keep using None.

3. Derivative-path collision fixes (content_hash.rs, main.rs)

   New content_hash::library_scoped_legacy_path helper:
   <derivative_dir>/<library_id>/<rel_path>. Thumbnail generation
   (startup walk + watcher needs-thumb check) and serving now use
   it; serving falls back to the bare-legacy mirrored path so
   pre-multi-library deployments keep working without
   regeneration. Without this, lib2 with the same rel_path as lib1
   would have its thumbnail request short-circuit to lib1's image.

   Orphaned-playlist cleanup walks every library when checking for
   the source video (was: BASE_PATH only). Without this, mounting
   a 2nd library and waiting 24h would delete every playlist whose
   source lived only in the 2nd library.

   The HLS playlist write path collision (filename-only basename,
   not rel_path) is left as a known issue with a TODO at the call
   site — the actor-pipeline rewrite belongs in Branch B/C.

Tests: 212 pass (cargo test --lib). New tests cover the probe
states (online / missing root / non-dir / empty-with-prior-data),
refresh_health transitions, query_by_exif scoping, get_exif_batch
keying on (library_id, rel_path), library_scoped_legacy_path, and
count_for_library.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:12:49 +00:00
Cameron
98601973f7 faces: log at the three 503 paths in update_face_handler
PATCH /image/faces/{id} can return 503 from three places (face client
disabled, transient embed error, mid-flight disable) and none of them
were logging — operator sees the status code but nothing in the Rust
log explaining why. Add warn! lines at each so future bbox-edit
failures aren't silent. Response body is unchanged so existing clients
keep working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:57:51 -04:00
Cameron
44d677528e tags: add edit + delete endpoints, enable FK enforcement
PUT /image/tags/{id} renames a tag globally; DELETE /image/tags/{id}
removes a tag and every photo's reference. Rename returns 200/404/409
(case-insensitive name conflict) / 400 (empty name); delete returns
204/404. New migration adds a UNIQUE COLLATE NOCASE index on
tags.name with a pre-flight pass that collapses existing case-
insensitive duplicates onto the lowest id.

The connection setup now sets PRAGMA foreign_keys = ON. The schema
already declares ON DELETE CASCADE / SET NULL on several tables —
those clauses were documentation-only because SQLite has FK
enforcement off per-connection by default. Audited every
diesel::delete site; each touches either no inbound FKs or has a
matching policy. delete_tag relies on the tagged_photo cascade
instead of doing manual cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:26:35 -04:00
Cameron Cordes
323097c650 faces: count distinct content_hash in stats total_photos
face_detections is keyed on content_hash (one row per unique bytes,
shared across libraries / duplicate paths) but total_photos was
COUNT(*) over image_exif rows. A file present at multiple rel_paths or
across libraries inflated the denominator without inflating the
numerator, leaving a permanent gap (e.g. 1101/1103 with nothing
actually pending detection).

Switch total_photos to COUNT(DISTINCT content_hash) so numerator and
denominator live in the same domain. Exclude rows with NULL
content_hash from the count — they're held in the hash-backfill
backlog, not the detection backlog, and counting them pins the bar
below 100% for the duration of that pass.

CLAUDE.md: document the stats domain rule next to the rest of the
face-detection notes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:41:20 +00:00
Cameron Cordes
67abd8d8ff style: cargo fmt
Pre-existing whitespace drift in test bodies, normalized by rustfmt.
No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:16:34 +00:00
Cameron Cordes
0840d55c70 faces: exclude videos from backlog drain and SCANNED denominator
list_unscanned_candidates pulled every hashed image_exif row, including
videos. filter_excluded then dropped them client-side without writing a
marker, so the same set re-appeared every watcher tick — emitting the
"backlog drain — running detection on N candidate(s)" log forever and
producing no progress.

face_stats.total_photos counted the same video rows in the denominator,
so the SCANNED percentage was structurally capped below 100%.

Add an image-extension SQL predicate (case-insensitive, sourced from
file_types::IMAGE_EXTENSIONS) and apply it to both queries. Videos
never enter the candidate set, total_photos counts only what can
actually be scanned, and 100% becomes reachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:16:30 +00:00
Cameron Cordes
f50655fb21 indexer: apply EXCLUDED_DIRS to remaining WalkDir callers
Audit follow-up to 5bf4956. The same `@eaDir` pruning that protects
the indexer also needs to protect the other walks under library roots:

- `create_thumbnails` walks every file in every library to generate
  thumbnails. Without EXCLUDED_DIRS, it would generate thumbnails of
  Synology's `SYNOFILE_THUMB_*.jpg` thumbnails (thumbnails of thumbnails).
- `update_media_counts` walks for the prometheus IMAGE / VIDEO gauges.
  Without EXCLUDED_DIRS, the gauges over-count by however many phantom
  `@eaDir` images live alongside the real photos.
- `cleanup_orphaned_playlists` walks BASE_PATH searching for source
  videos by filename. EXCLUDED_DIRS isn't a behavior change for typical
  Synology mounts (no .mp4 in @eaDir), but it's a correctness win for
  any operator-defined exclude that happens to contain video.

Refactor: add `walk_library_files(base, excluded_dirs) -> Vec<DirEntry>`
to file_scan.rs as the shared primitive. `enumerate_indexable_files`
now layers media-type + mtime filters on top of it. One new test
covers the lower-level helper (returns all extensions, prunes excluded
subtrees).

`generate_video_gifs` (currently `#[allow(dead_code)]`, not reachable
from main) gets the `update_media_counts` signature update and reads
EXCLUDED_DIRS from env so a future revival isn't broken — but its
WalkDir walk stays raw because the dual lib/bin compile makes the
file_scan module path non-trivial there. Tagged with a comment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:21:17 +00:00
Cameron Cordes
5bf49568f1 indexer: prune EXCLUDED_DIRS at WalkDir time, extract enumerate_indexable_files
Synology drops `@eaDir/.../SYNOFILE_THUMB_*.jpg` files alongside every
photo. The face-detect pipeline already filters those out via
`face_watch::filter_excluded`, but the filter runs *after* the indexer
has already inserted rows into `image_exif`. Result: phantom rows whose
content_hash never matches a `face_detections` row, so the anti-join in
`list_unscanned_candidates` returns them every tick. They're filtered
out at runtime, no marker is written, and the cycle repeats forever —
log spam, wrong stats denominator, and on a real Synology library the
phantom rows balloon into the hundreds of thousands.

Move the exclusion to the WalkDir pass, where filter_entry can prune
whole subtrees instead of walking and discarding leaves. Extract the
pre-existing 30-line walker chain in main.rs::process_new_files into
`file_scan::enumerate_indexable_files` so it's testable in isolation.

Six tests cover the bug (eadir prune), nested patterns, absolute-under-base
syntax, non-media filtering, modified_since semantics, and forward-slash
rel_path normalization.

Out of scope (other WalkDir callers in main.rs that don't yet apply
EXCLUDED_DIRS — thumbnail gen at 1309, media scan at 1377, video
playlist scan at 1685, and two nested walks at 1709 / 1743): separate
audit PR.

Operator note: existing phantom rows still need a one-shot cleanup —
  DELETE FROM face_detections WHERE content_hash IN (
    SELECT content_hash FROM image_exif WHERE rel_path LIKE '%/@eaDir/%'
  );
  DELETE FROM image_exif WHERE rel_path LIKE '%/@eaDir/%' OR rel_path LIKE '@eaDir/%';
Run before attaching a fresh Synology-sourced library.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 19:29:37 +00:00
Cameron Cordes
db9dc63e5e sqlite: enable WAL + busy_timeout in connect(); 408/413/429 transient
The DB connection helper now sets `journal_mode=WAL`, `busy_timeout=5000`,
and `synchronous=NORMAL` on every connection. 13+ DAOs each open their
own connection through this helper and share one SQLite file — without
WAL, a writer's exclusive lock blocks readers and `load_persons` racing
the face-watch write storm errored instantly with "database is locked".
GPU face inference made this visible by speeding detect ~10× and
flooding the writer side. WAL persists in the file once set so the
debug binaries that bypass connect() inherit it automatically.

Also widen face_client.rs's classifier: 408 / 413 / 429 are now Transient
instead of Permanent. These are operator-fixable proxy/infra errors;
marking them Permanent poisons every affected photo with status='failed'
and requires manual SQL to recover. Specifically, Apollo's nginx
defaulted to a 1 MB body cap and silently rejected normal-size photos
before they reached the backend — the deferred-and-retry contract is
the right behavior for that class of fault.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:13:15 +00:00
Cameron Cordes
5e1bad3179 faces: filter videos out of detection candidate set
The backlog drain pulls every hashed image_exif row, which includes videos.
Sending them to Apollo just produces 422 decode_failed → status='failed'
markers, burning a round-trip per video and inflating the FAILED stat.

Widen filter_excluded to also drop anything is_image_file rejects. Covers
both call sites (file-watch hook and per-tick backlog drain) without
plumbing a second filter through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:45:55 +00:00
Cameron Cordes
1971eeccd6 faces: drain backfill + detection backlog every tick, not just full scans
Symptom: ImageApi restart, then ~60 minutes of silence — no
face_watch lines at all. Cause: backfill + face-detection candidate
build were both gated inside process_new_files, which during quick
scans (every 60s) only walks files modified in the last interval.
The pre-existing unhashed / unscanned backlog never entered the
candidate set, so it only drained on the full-scan path (default
once per hour). Surfaced as "scan stuck at 1101/13118" — most of
those rows were waiting on the next full scan.

Two new per-tick passes that work directly off the DB:

(1) backfill_unhashed_backlog uses ExifDao::get_rows_missing_hash to
    pull unhashed rows in id order, capped (FACE_HASH_BACKFILL_MAX_PER_TICK
    default 2000), and writes content_hash for each. No filesystem
    walk — the walk was the gating filter that hid the backlog.

(2) process_face_backlog uses a new FaceDao::list_unscanned_candidates
    (LEFT-anti-join on content_hash via raw SQL, GROUP BY hash so
    duplicates fire one detect call) to pull a capped batch of
    hashed-but-unscanned rows (FACE_BACKLOG_MAX_PER_TICK default 64)
    and runs the existing face_watch detection pipeline on them.

Both run only when face_client.is_enabled(). The cap on (2) is small
because each candidate is a real Apollo round-trip — 64/tick at 60s
quick interval ≈ 64 detections/min, which paces an 8-core CPU
inference comfortably while keeping a steady flow visible in logs.
process_new_files's own backfill stays in place for the same-tick
flow (a brand-new upload gets hashed AND face-scanned in the tick
where it's discovered) but is now belt-and-suspenders.

Test backstop pinning the new DAO method's filter contract: only
hashed, unscanned, in-library rows are returned; scanned rows,
unhashed rows, and other-library rows are filtered out.
2026-04-30 01:46:49 +00:00
Cameron Cordes
c2c1fe5b8b faces: bbox crop respects EXIF orientation + pads enough for RetinaFace
Two reasons manually-drawn bboxes were never resolving a face on
re-detection:

(1) The bbox arrives in display space (browser already applied EXIF
    orientation when rendering the carousel), but the `image` crate
    in crop_image_to_bbox opens raw pre-rotation pixels. For any
    phone photo with Orientation 6/8/etc., applying the bbox without
    rotating first crops a completely different region of the image
    — landing on background, hair, or empty pixels. Now reads the
    EXIF Orientation tag and applies it before indexing into the
    canonical-oriented dims.

(2) Padding was 10 % on each side. A typical 200×250 face bbox +
    10 % becomes ~240×300; insightface resizes that to det_size=640,
    so the face fills ~95 % of the input. RetinaFace's anchors
    expect faces at 20–60 % of input dimensions; at 95 % it
    routinely returns zero detections. Bumped to 50 % padding so the
    crop is 2× the bbox dims and the face occupies ~50 % of the
    input — anchor-friendly. Bbox is still clamped to image bounds,
    so edge-of-image cases just get less padding on the clipped
    side.

Together these explain why bbox-edit re-embed practically always
fell into the "no face detected" branch (and bbox-edit reverts
without the recent soft-fallback commit). Per-photo embedding
quality also improves slightly — same face, more context, better
landmarks for ArcFace.
2026-04-30 01:06:08 +00:00
Cameron Cordes
5a2f406429 faces: bbox edits survive when re-detection finds no face
Moving a tagged bbox off-center (to fine-tune position, or onto a
back-of-head the operator already manually tagged) made
update_face_handler 422 because the re-embed step ran detection on
the new crop and found nothing. Frontend's catch then reverted the
optimistic update — visible as the bbox snapping back the moment the
user released their drag.

The re-embed is a soft contract: a fresh ArcFace vector is preferable,
but the operator's bbox edit is sacred. Now:

  - empty faces[] → keep old embedding, apply the bbox, log info
  - permanent embed error → keep old embedding, apply the bbox, log info
  - bad-bytes embedding → keep old embedding, apply the bbox, log warn
  - transient failure (cuda_oom, engine unavailable) still 503s so
    the operator can retry — those are recoverable and we don't want
    to silently drift cluster math on retries that succeed later

Cost: a slightly stale embedding for the row, which marginally
affects clustering / auto-bind cosine for files re-detected against
this person. Accepted because dropping the user's manual drag every
time the new crop happens to lose detection is a much worse UX —
especially for the force-create rows (back of head, profile) where
re-detection will *always* fail.
2026-04-30 01:01:07 +00:00
Cameron Cordes
6a6a4a6a46 tags: batch lookup expands content-hash siblings cross-library
The first cut matched by rel_path only — fine for single-library
deploys but wrong for multi-library setups where the same content
lives under different rel_paths (e.g. a backup mount holding copies
of the primary library). A tag applied under library A would silently
not appear in the library-B grid badge even though the carousel's
per-path /image/tags would resolve it correctly via siblings.

The batch handler now does the expansion server-side in three queries
regardless of input size:

  1. image_exif batch lookup → query path → content_hash
  2. image_exif JOIN by content_hash → all sibling rel_paths sharing
     each hash (paths are deduped across libraries)
  3. tagged_photo + tags JOIN over the union of (query + sibling)
     rel_paths

Tags are then aggregated back to query paths via a sibling→originals
reverse map, deduped by tag id. Files without a content_hash (just
indexed, hash compute pending, etc.) skip step 2 and only get tags
from their own rel_path — same fallback the per-path handler uses.

Adds ExifDao::get_rel_paths_for_hashes (batch counterpart of
get_rel_paths_by_hash) chunked at 500 to stay under SQLite's
SQLITE_LIMIT_VARIABLE_NUMBER. Five queries for a 4k-photo grid is
still ~800x cheaper than per-path HTTP fan-out.
2026-04-30 00:36:44 +00:00
Cameron Cordes
3112260dc8 tags: batch lookup endpoint to collapse photo-match fan-out
Apollo's photo-match enrichment fanned out one ``GET /image/tags?path=``
per record (bounded concurrency 20) — for a 4k-photo time window that
meant ~4000 round-trips, each briefly contending the tag-dao mutex.
The cost dwarfed the actual SQL.

Add a single ``POST /image/tags/lookup`` body ``{paths: [...]}``
returning ``{path: [tag, ...]}`` with only paths that have at least
one tag. SqliteTagDao gains ``get_tags_grouped_by_paths`` which JOINs
tagged_photo + tags and chunks the IN clause at 500 (safely under
SQLite's variable limit). Five queries for a 4k-photo grid is ~800x
cheaper than 4k HTTP calls.

Trade-off: the batch matches by rel_path directly and does not do the
cross-library content-hash sibling expansion that the per-path
``GET /image/tags`` does. For Apollo's grid that's accepted as
deliberate — single-library deploys see no difference, multi-library
deploys with rel_path-divergent siblings might miss a tag in the grid
badge but the carousel still resolves full sibling tags via the
per-path endpoint when opened. If sibling sharing in the grid becomes
load-bearing, extend the handler to JOIN image_exif on content_hash.
2026-04-30 00:28:33 +00:00
Cameron Cordes
16abacf4c5 faces: backfill no longer stalls on chronic-error files at the front
The content-hash backfill capped at 500/tick AND counted errors
against that cap. So a pocket of files that errored every time
(vanished mid-scan, permission denied, unreadable) at the head of the
exif_records iteration order burned the entire budget every tick and
the rest of the backlog never advanced — surfacing as a face-scan
stuck at e.g. 44% with no progress. Without a content_hash, those
photos never become face-detection candidates, so it looks like
detection is broken when really it's the prerequisite hash that
isn't filling.

Two fixes:

  - Cap on successes only. Errors still get counted and logged but
    don't burn the per-tick budget; the loop keeps moving past them
    to the working files behind. Errors are bounded by the unhashed
    backlog size (each record walked at most once per tick), so this
    can't run away.

  - Always log the unhashed backlog count when non-zero. Previously
    "stuck at 44%" looked silent from the outside; now every tick
    surfaces "backfilled N/M; K still need backfill" so an operator
    can tell backfill is making progress (or isn't).

Also bumps the default cap from 500 to 2000. Hashing is cheap (blake3
+ one DB UPDATE), and 500 was conservative for a personal-scale
library where 10k+ unhashed files is a normal first-run state.
2026-04-30 00:03:26 +00:00
Cameron Cordes
891a9982ef faces: force-create path for regions the detector can't see
Adds an opt-in 'force' flag to POST /image/faces. When set, the handler
skips the Apollo embed call entirely and stores the row with a
2048-byte zero-vector embedding under the sentinel model_version
'manual_no_embed'. The row participates as a browse-by-person tag but
is excluded from clustering and auto-bind:

- face_clustering._decode_b64_embedding filters norm<=0 (already)
- cluster suggester groups by model_version, so the sentinel never
  mixes with real buffalo_l rows
- cosine_similarity with a zero vector resolves to 0/NaN, never
  crossing the 0.4 auto-bind threshold

Use case: tag someone looking away from the camera, profile shot,
heavily-occluded face — anywhere the detector returns no_face_in_crop
on the user's drawn region. The frontend only sets force=true after a
422 from a strict create plus an explicit operator confirmation, so
the normal "draw a centered face" UX still gets a real ArcFace
embedding.
2026-04-29 23:49:34 +00:00
Cameron Cordes
0eaf27d2d3 faces: cover hydrate_face_with_person — assigned + unassigned branches
Two unit tests pinning the response shape that PATCH/POST /image/faces
relies on. They use the existing in-memory SQLite harness and exercise
the helper directly:

- assigned: person_name resolves through the persons join and bbox /
  source / person_id round-trip cleanly.
- unassigned: person_name is None (not stale, not omitted), person_id
  is None.

These would have caught the prior regression — when the handlers
returned a bare FaceDetectionRow, person_name was structurally absent
from the response shape. A test that asserts person_name is populated
when person_id is set forces the join (or any equivalent) to exist.

A dangling-person_id case isn't covered: the FK on face_detections
makes that state structurally impossible at rest (ON DELETE SET NULL
zeroes the column when a person is removed), so there's nothing to
defend against.
2026-04-29 23:41:52 +00:00
Cameron Cordes
0c2f421a1f faces: PATCH/POST /image/faces returns person_name with the row
Both create_face_handler and update_face_handler returned the bare
FaceDetectionRow, so PATCH /image/faces/{id} (used by both bbox edits
and person assignment) replied without person_name. The carousel
overlay does an optimistic replace on this row — replacing the joined
FaceWithPerson with a row that has person_name = undefined visibly
dropped the VFD label off the bbox after every save.

Add a small hydrate_face_with_person helper that does the persons
lookup and assembles a FaceWithPerson, used by both handlers. The
list endpoint already does the join, so the PATCH/POST shape now
matches it.
2026-04-29 23:38:24 +00:00
Cameron Cordes
43cb60d3ad faces: re-embed on bbox edit instead of leaving the embedding stale
Phase 2 stored the new bbox on PATCH /image/faces/{id} but logged
"embedding now stale (Phase 3 will re-embed)" and moved on. That left
the embedding column pointing at the *old* face area while the bbox
described a new one — auto-bind cosine similarity and the cluster
suggester would silently rank the row as "the same face it was before
the edit" forever after, even though the geometry no longer matched.

Now: when the PATCH includes a bbox, the handler:
  1. Looks up the row to find its photo (library_id + rel_path).
  2. Crops the new bbox region with the same crop_image_to_bbox helper
     manual-create uses (10% pad on each side so the detector has
     ear/jaw context).
  3. POSTs the crop to face_client.embed for a fresh ArcFace vector.
  4. Stores both the new bbox AND the new embedding in one
     update_face transaction.

Errors map cleanly:
  - face_client disabled → 503 (bbox edit needs Apollo).
  - decode failure / no face in crop → 422.
  - Apollo CUDA OOM / unavailable → 503 transient.
  - Underlying row missing → 404.

About 100-500ms per edit on CPU, dominated by Apollo's inference call.
Acceptable for a manual operator action; the alternative (stale
embedding) silently broke the rest of the face stack.

Prerequisite for the upcoming carousel-side draw/resize bbox UI —
without re-embed, every operator-driven bbox tweak would corrode the
clustering/auto-bind quality. ApiPatchFaceBody on Apollo's side
already passes bbox through verbatim, so no Apollo change needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:10:25 +00:00
Cameron Cordes
7303fb8aa3 faces: ignore/junk bucket — DB schema + lazy-create endpoint
A single global "Ignored" person row, marked is_ignored=true, that the
frontend lazily creates on first use to hold strangers, false
detections, and faces the user doesn't want bound to a real person.

Schema (new migration 2026-04-29-000200_add_is_ignored):
  - persons.is_ignored BOOLEAN NOT NULL DEFAULT 0
  - Partial index on (is_ignored) WHERE is_ignored = 1; small WHERE
    set means a tiny index that only ever services the bucket lookup.

Why a real persons row instead of a separate table or status enum:
  - face_detections.person_id stays a clean foreign key — no special
    code paths for "ignored faces" anywhere else in the schema.
  - The cluster-suggester already filters by `person_id IS NULL`, so
    bound-to-ignored faces are naturally excluded from re-clustering
    without any change.
  - merge / rename / delete all work on it with the existing routes
    (the management UI just hides it from default views).

DAO additions / changes:
  - get_or_create_ignored_person (idempotent; race-safe via the
    UNIQUE COLLATE NOCASE on persons.name + retry-on-409 fallback).
  - list_persons gains an include_ignored parameter; default false
    so the management screen hides the bucket unless asked.
  - find_persons_by_names_ci filters is_ignored=0 in SQL so the
    auto-bind path can NEVER target the bucket — even if the user
    happens to tag photos as "Ignored", the heuristic look-up skips
    it. Bucket assignment is always an explicit operator action.
  - update_person accepts is_ignored: Option<bool> so a person can
    be moved into / out of the bucket without a delete + recreate.

Routes:
  - POST /persons/ignore-bucket — returns the bucket, creating it on
    first call. Frontend uses this lazily right before binding.
  - GET /persons gains ?include_ignored=true; default behavior
    unchanged.
  - PATCH /persons/{id} now accepts is_ignored.

Tests: ignore_bucket_idempotent_and_filters_auto_bind covers the
contract: bucket is idempotent across calls, find_persons_by_names_ci
skips it (even on exact name match), default list_persons hides it,
include_ignored=true surfaces it. All other tests updated to pass
the new is_ignored: false / Option<bool> fields explicitly.

cargo test --lib: 181/0; fmt + clippy clean for new code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:48:16 +00:00
Cameron Cordes
0e160f5d22 faces: include bbox on /faces/embeddings response
Apollo's cluster suggester wants to render a *face*-cropped thumbnail
for each cluster's representative — a multi-person photo with the
cluster about 'one' of them was unreadable when the thumb showed the
whole image. Plumbing bbox through means the UI can crop to the rep
face without an extra round-trip per cluster.

FaceEmbeddingRow gains bbox_x/y/w/h (Optional<f32>, mirrors the column
nullability — for status='detected' rows the CHECK constraint
guarantees they're populated, but the type stays nullable as
documentation). list_embeddings already loaded these from the
underlying FaceDetectionRow; this commit just stops dropping them
when constructing the response.

No DB changes; no behavior change for existing callers (the new
fields are additive).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 21:01:58 +00:00
Cameron Cordes
a24fac5511 faces: backfill missing content_hash from the file watcher
Photos indexed before content-hashing landed (or where the hash compute
failed silently on insert) end up in image_exif with NULL content_hash.
build_face_candidates keys on content_hash, so those rows would never
become face candidates without backfill — symptom: face detection logs
nothing despite photos being in the library and the watcher running.

The dedicated `backfill_hashes` binary already handles this; this
commit lets the watcher self-heal during full scans so the deploy
'just works' for face recognition without operator action.

Idempotent — subsequent scans see populated hashes and no-op. Bounded
per tick by FACE_HASH_BACKFILL_MAX_PER_TICK (default 500) so a watcher
tick on a 50k-photo legacy library doesn't blake3 every file in one
shot. For very large backlogs the dedicated binary is still faster
(no DAO mutex contention with the watcher loop).

Only runs when face_client.is_enabled(), so legacy deploys without
APOLLO_FACE_API_BASE_URL keep the same behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:41:08 +00:00
Cameron Cordes
23f4941471 faces: surface enabled/disabled state + per-tick candidate count
Manual deploy debugging: 'Saved thumbnail' logs were visible (boot-time
thumbnail backfill) but no face_watch logs were appearing, with no
obvious way to tell whether the integration was disabled, hadn't reached
a full scan yet, or had simply seen no new files.

Two log lines:
  - watch_files startup: 'Face detection: ENABLED' / 'DISABLED (set
    APOLLO_FACE_API_BASE_URL or APOLLO_API_BASE_URL to enable)' so
    you can tell at a glance whether the env wired through.
  - process_new_files (debug-level): 'face_watch: scan tick — N image
    file(s) walked, M candidate(s) (library 'main', modified_since=...)'
    so an empty-candidate scan is distinguishable from a misconfigured
    or skipped one without bumping log level for the rest of the
    watcher.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:19:17 +00:00
Cameron Cordes
41f93d70d1 faces: tighten bootstrap candidate filter, bump to 1.1.0
Filter <3-char tags and emoji/symbol-bearing tags out of the bootstrap
candidate list before grouping. Manual testing surfaced these as noise
the operator never tickets — they pushed real candidates lower in the
list and made the UI harder to scan. This is a hard filter (drop from
candidates entirely), not a heuristic flag — looks_like_person still
governs the default-checked decision for the rows that *do* survive.

is_plausible_name_token rules:
  - >= 3 chars after trimming (rejects "AB", "OK", whitespace-only)
  - Each char is alphabetic (any script — covers Renée, José, 田中太郎),
    whitespace, name-punctuation (' - . _ U+2019), or ASCII digit
  - Anything else (emoji, symbols, math, arrows, control codes) drops
    the whole tag

Digits stay allowed at this layer; looks_like_person handles "Trip 2018"
on the heuristic side. Lets a "Sarah2" alias still appear so the
operator can spot and confirm it manually, just unticked by default.

Cargo version bump 1.0.0 → 1.1.0 marks the face-recog feature surface
landing — Phase 2's schema + endpoints, Phase 3's file-watch hook, and
Phase 4's bootstrap + auto-bind are all behind APOLLO_FACE_API_BASE_URL,
so legacy 1.0 deploys without that env see no behavior change.

Tests: 1 new (faces::tests::is_plausible_name_token_filters_short_and_emoji)
covers the accept-list (Latin/accented/Asian scripts, hyphenated and
apostrophe names) and the reject-list (length floor, emoji classes,
symbols, leading/trailing whitespace handling).

cargo test --lib: 180 / 0; fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 19:05:04 +00:00
Cameron Cordes
1859399759 faces: phase 4 — people-tag bootstrap + auto-bind on detection
Wires the existing string people-tags into the new persons table and
auto-binds new detections to a same-named person when the photo carries
exactly one matching tag. ImageApi has no notion of which tags are
people-tags today (purely a user mental model), so this is operator-
confirmed: the suggester surfaces candidates with a heuristic flag, the
operator confirms, then bootstrap creates persons rows. Auto-bind
follows on every detection thereafter.

New endpoints:
  GET  /tags/people-bootstrap-candidates
       Per case-insensitive name group: display name (most-frequent
       capitalization), normalized lowercase, summed usage_count,
       looks_like_person heuristic flag, already_exists check against
       the persons table. Sorted persons-likely-first then by count.
  POST /persons/bootstrap
       Body: {names: [string]}. Idempotent — pre-fetches the existing-
       name set so a duplicate request reports per-row "already exists"
       instead of 409-ing each insert. Created rows get
       created_from_tag=true; failed rows surface in `skipped` with a
       reason.

looks_like_person heuristic — conservative on purpose because the
operator confirms in the UI:
  - 1–2 whitespace-separated words
  - Each word starts uppercase, no digits anywhere
  - Single-word names not on a small denylist (cat, christmas, beach,
    sunset, untagged, ...). Two-word names skip the denylist so
    "Sarah Smith" is never false-rejected.

FaceDao additions:
  - find_persons_by_names_ci — bulk lowercase-name → person_id lookup
    via sql_query (Diesel's BoxedSelectStatement + LOWER() doesn't
    play well with the type system).
  - person_reference_embedding — L2-normalized mean of a person's
    detected embeddings, *filtered by model_version* so a future
    buffalo_xl row can never contaminate an in-flight buffalo_l auto-
    bind decision. Returns None when the person has no faces yet.
  - assign_face_to_person — sets face_detections.person_id and, only
    when persons.cover_face_id is NULL, claims this face as cover. The
    UI's hand-picked cover survives later auto-binds.
  - decode_embedding_bytes / cosine_similarity helpers — pub(crate)
    so face_watch can decode the wire bytes once and feed them through
    the cosine threshold.

Auto-bind in face_watch::process_one:
  After every successful detect, for each newly-stored auto face we
  pull the photo's tags, look up which (if any) map to existing
  persons, and:
    - skip when zero or multiple distinct persons are matched
      (multi-match is genuinely ambiguous; cluster suggester handles it)
    - on first face for a person: bind unconditionally so bootstrap can
      ever produce a usable reference
    - thereafter: bind iff cosine(new_emb, person_ref) >=
      FACE_AUTOBIND_MIN_COS (default 0.4, env-tunable to 0..=1)
  The reference embedding comes from person_reference_embedding under
  the same model_version as the candidate, so a model upgrade never
  silently re-anchors a person's centroid.

Plumbing: watch_files now constructs its own SqliteTagDao alongside the
other watcher DAOs and threads it through process_new_files →
run_face_detection_pass → process_one. The handler-side TagDao
registration in main.rs already covers bootstrap_candidates_handler;
no extra app_data wiring needed.

Tests: 8 new (faces.rs):
  - looks_like_person accepts/rejects/two-word-skips-denylist (3)
  - cosine_similarity on identical / orthogonal / opposite / mismatch /
    zero / empty inputs
  - decode_embedding_bytes round-trip + size validation
  - find_persons_by_names_ci groups case + handles empty input
  - person_reference_embedding filters by model_version (buffalo_l ref
    must not include buffalo_xl rows)
  - assign_face_to_person sets cover when unset, doesn't overwrite

cargo test --lib: 179 / 0; fmt + clippy clean for new code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:55:01 +00:00
Cameron Cordes
f985a0d658 faces: surface UNIQUE constraint as 409, not 500
Manual smoke test caught a bug: POST /persons with a duplicate name
returned 500 with the body 'insert person Cameron' instead of the
intended 409 Conflict.

Root cause: the handler keyed on `format!("{}", e).contains("unique")`,
but anyhow's plain Display only renders the *outermost* context
("insert person Cameron") and hides the diesel error nested below
('UNIQUE constraint failed: persons.name'). The string check was a
false negative on every duplicate.

Fix: walk the source chain and downcast for
diesel::result::Error::DatabaseError(UniqueViolation, _) — exposed
via a shared `is_unique_violation` helper used by both
create_person_handler and update_person_handler. Error bodies for
non-unique failures now use `{:#}` so the body actually carries the
underlying cause when the user surfaces it.

merge_persons_handler also moves to `{:#}` for richer error bodies;
the "itself" check was already structural and unaffected.

Regression test (faces::tests::is_unique_violation_walks_chain) pins
both the bug shape ({} doesn't surface UNIQUE) and the fix
(is_unique_violation correctly downcasts the chain), so a future
refactor of error handling can't silently re-bury this.

cargo test --lib: 171 / 0; fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:44:10 +00:00
Cameron Cordes
4dee7b6f73 faces: phase 3 — file-watch hook drives auto detection
Wire face detection into ImageApi's existing scan loop so new uploads
pick up faces automatically and the initial backlog grinds through on
full-scan ticks. No new job system; Phase 2's already_scanned check
makes the work implicitly idempotent (one face_detections row per
content_hash, including no_faces / failed marker rows).

face_watch.rs (new):
  - run_face_detection_pass(library, excluded_dirs, face_client,
    face_dao, candidates) — sync entry point. Builds a per-pass tokio
    runtime and fans out detect calls bounded by FACE_DETECT_CONCURRENCY
    (default 8). The watcher thread itself stays sync.
  - filter_excluded — applies the same PathExcluder /memories uses, so
    @eaDir / .thumbnails / EXCLUDED_DIRS-listed paths skip detection
    before we burn a detect call (and Apollo's GPU memory) on junk.
  - read_image_bytes_for_detect — RAW/HEIC route through
    extract_embedded_jpeg_preview because opencv-python-headless can't
    decode either; everything else gets a plain std::fs::read so EXIF
    orientation reaches Apollo's exif_transpose intact.
  - process_one — translates Apollo's response into the Phase 2 marker
    contract: faces[] empty → no_faces; FaceDetectError::Permanent →
    failed (don't retry); Transient → no marker (next scan retries);
    success with N faces → N detected rows with the embeddings unpacked.

main.rs (process_new_files + watch_files):
  - watch_files now also takes face_client + excluded_dirs; the watcher
    thread builds a SqliteFaceDao the same way it builds ExifDao /
    PreviewDao.
  - After the EXIF write loop, build_face_candidates queries image_exif
    for the just-walked image paths' content_hashes (covers new uploads
    and pre-existing backlog), filters out anything already_scanned, and
    hands the rest to face_watch::run_face_detection_pass.
  - Bypassed wholesale when face_client.is_enabled() is false — keeps
    the watcher usable on legacy deploys where Apollo isn't configured.

Tests: 5 face_watch unit tests cover the parts that don't need a real
Apollo:
  - filter_excluded drops dir-component patterns (@eaDir) without
    matching substring file names (eaDir-not-a-thing.jpg keeps).
  - filter_excluded drops absolute-under-base subtrees (/private).
  - empty EXCLUDED_DIRS short-circuits cleanly.
  - read_image_bytes_for_detect passes JPEG bytes through verbatim
    (orientation must reach Apollo unmodified).
  - read_image_bytes_for_detect falls through to plain read when a
    RAW-extension file has no embedded preview, so Apollo gets a chance
    to 422 and we mark failed rather than infinitely-retrying.

cargo test --lib: 170 / 0; fmt and clippy clean for new code.
End-to-end (drop a photo → face_detections row appears) needs Apollo
running and is deferred to deploy-time verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:21:19 +00:00
Cameron Cordes
f77e44b34d faces: fix PathExcluder false-positive + cover face_client/crop in tests
PathExcluder was iterating every component of the absolute path,
including the system prefix. Two of the existing memories tests had
been failing on master because tempdir() lives under /tmp on Linux
and a pattern like "tmp" then matched the system /tmp component
rather than anything the user actually asked to exclude. Phase 3's
file-watch hook will use the same code to skip @eaDir / .thumbnails
under each library's BASE_PATH, so the bug would hide every photo
on a host whose BASE_PATH passes through a directory named the same
as a user pattern.

Fix: store base in PathExcluder and strip it before scanning
components. A path that lives outside base falls through to the
no-match branch (defensive — nothing legit hits that today).

Also extracted the face_client error classification into a pure
classify_error_response(status, body) so the marker-row contract
with Apollo (422 → Permanent / 'failed', 5xx → Transient / defer)
is unit-testable without spinning up an HTTP server.

New tests:
  memories::tests::test_path_excluder_*           — 2 previously
    failing tests now pass.
  ai::face_client::tests::classify_*              — 4 cases:
    422 decode_failed → Permanent, 503 cuda_oom → Transient
    (handles both string and {code:..} detail shapes), 5xx →
    Transient + other 4xx → Permanent, unparseable HTML body still
    classifies on status.
  faces::tests::crop_*                            — 3 cases:
    invalid bbox rejected, valid bbox round-trips through JPEG
    decode, corner crop with 10% padding clamps inside source.

cargo test --lib: 165 passed / 0 failed (was 156 / 2 failed).
cargo fmt and clippy on new code clean. The remaining
sort_by clippy warnings in pre-existing files (memories.rs,
files.rs, exif.rs) are unrelated and present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:09:44 +00:00
Cameron Cordes
860169032b faces: phase 2 — schema + manual face/person CRUD
Land the persistence model and HTTP surface for local face recognition.
Inference still lives in Apollo (Phase 1); this side adds the data home
plus every endpoint Apollo's UI and FileViewer-React will consume.

Schema (new migration 2026-04-29-000000_add_faces):
  - persons: visual identities. Optional entity_id bridges to the
    existing knowledge-graph entities table; auto-bridging is left to
    the management UI (we don't muddy LLM provenance from face rows).
    UNIQUE(name COLLATE NOCASE) so 'alice' / 'Alice' fold to one row.
  - face_detections: keyed on content_hash (cross-library dedup), with
    status='detected' carrying bbox + 512-d embedding BLOB, and
    'no_faces' / 'failed' marker rows that tell Phase 3's file watcher
    not to re-scan. Marker invariant enforced via CHECK; partial UNIQUE
    on content_hash WHERE status='no_faces' guards against double-marks.

Schema regenerated with `diesel print-schema` against a clean migration
run; joinables added for face_detections → libraries / persons and
persons → entities.

face_client.rs (sibling of apollo_client.rs):
  - reqwest multipart, 60 s timeout (CPU inference on a backlog can be
    slow; bounded threadpool on Apollo serializes calls anyway).
  - FaceDetectError::{Permanent, Transient, Disabled} — Phase 3 keys
    its marker-row decision on this. 422 → mark failed, 5xx → defer.
  - APOLLO_FACE_API_BASE_URL falls back to APOLLO_API_BASE_URL when
    unset; both unset = is_enabled() false, callers no-op.

faces.rs (DAO + handlers):
  - SqliteFaceDao implements the full FaceDao trait; person face counts
    go through sql_query because diesel's BoxedSelectStatement +
    group_by trips trait-resolver recursion.
  - merge_persons re-points face rows in a transaction, copies notes
    when target's are empty, deletes src.
  - manual POST /image/faces resolves content_hash through image_exif,
    crops the user-drawn bbox with 10% padding (detector wants context
    around ears/jaw), POSTs the crop to face_client.embed for a real
    ArcFace vector, then inserts source='manual'.
  - Cluster-suggest (Phase 6) gets its data from
    GET /faces/embeddings — base64-encoded paged BLOBs so Apollo's
    DBSCAN can stream them without ImageApi pre-aggregating.

Endpoints registered alongside add_*_services in main.rs:
  GET    /faces/stats?library=
  GET    /faces/embeddings?library=&unassigned=&limit=&offset=
  GET    /image/faces?path=&library=
  POST   /image/faces                        (manual create via embed)
  PATCH  /image/faces/{id}
  DELETE /image/faces/{id}
  GET    /persons?library=
  POST   /persons
  GET    /persons/{id}
  PATCH  /persons/{id}
  DELETE /persons/{id}?cascade=set_null|delete   (set_null default)
  POST   /persons/{id}/merge
  GET    /persons/{id}/faces?library=

The file-watch hook (Phase 3) and the rerun-on-one-photo handler
(Phase 6) live behind the FaceDao methods marked dead_code today —
they're called only when those phases land. Same shape for the trait
methods that aren't reached by Phase 2 routes.

Tests: 3 DAO unit tests cover person CRUD + case-insensitive uniqueness,
marker-row idempotency (mark_status is a no-op when any row exists),
and merge re-pointing faces.

Cargo.toml: reqwest gains the `multipart` feature.

cargo build / cargo test --lib / cargo fmt / cargo clippy --all-targets
all clean for the new code; the two pre-existing test_path_excluder
failures and the pre-existing sort_by clippy warnings are unrelated and
present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:03:42 +00:00
Cameron Cordes
57fb0bcd3c EXIF GPS write: POST /image/exif/gps via exiftool
New endpoint accepts {path, library, latitude, longitude} and shells
out to exiftool to write GPSLatitude/GPSLongitude (with N/S, E/W refs)
into the file's EXIF in place. After the write, the handler
re-extracts EXIF and updates the image_exif row so the DB stays in
sync — the response carries the updated metadata block in one
round-trip. Falls through to store_exif if the row is missing.

`exif::write_gps` is the small helper. `-overwrite_original` so no
.orig sidecar is left behind. Validates lat/lon range + supports_exif
before spawning exiftool. Format support matches the existing read
path (JPEG / TIFF / RAW / HEIF / PNG / WebP) — videos still need a
different writer and aren't covered.

Apollo's "+ PIN" carousel button (separate commit on the Apollo side)
calls this through /api/photos/exif/gps. Drive-by: cargo fmt one-line
collapse on apollo_client.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 22:25:40 +00:00
Cameron Cordes
4ae7be35e9 Apollo Places: enrich insights with personal place name + notes
Optional integration with the sibling Apollo project's user-defined
Places (name + lat/lon + radius_m + description + category). When
APOLLO_API_BASE_URL is set, the per-photo location resolver folds the
most-specific containing Place into the LLM prompt's location string —
"Home (My house in Cambridge) — near Cambridge, MA" rather than the
city name alone. Smallest-radius wins; Apollo sorts server-side via
/api/places/contains, so the carousel badge in Apollo and the prompt
string here always agree.

Adds an agentic tool `get_personal_place_at(latitude, longitude)` that
the LLM can call during chat continuation. Tool description tells the
model the call returns the user's free-text notes, not just a name.
Deliberately narrow — no enumerate-all variant, lat/lon required.

Unset APOLLO_API_BASE_URL = legacy Nominatim-only path, tool is not
registered. 5 s timeout; all errors degrade silently.

Tests: 5 unit tests for compose_location_string (Apollo only, Nominatim
only, both, both-with-description, neither).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:11:12 +00:00
Cameron Cordes
6521a328bf RAW preview: exiftool fallback for MakerNote / SubIFD previews
kamadak-exif's In::PRIMARY / In::THUMBNAIL only address IFD0 and IFD1.
On modern Nikon NEFs the full-res review JPEG lives in the MakerNote's
PreviewIFD (and many Canon CR2s / DNGs put theirs in a SubIFD chain) —
both unreachable through the existing reader, so the previous patch
still produced no preview for those files and the pipeline fell through
to ffmpeg, which writes black frames when it can't decode the RAW.

Add a slow-path layer in extract_embedded_jpeg_preview that shells out
to exiftool for PreviewImage / JpgFromRaw / OtherImage (one process per
tag). All candidates from both layers are pooled and the largest valid
JPEG wins. exiftool not on PATH degrades to fast-path-only behavior
rather than breaking — the fallback is a strict superset.

Documented the new optional dependency in README.md and CLAUDE.md with
install commands for apt / brew / winget / choco.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:13:36 +00:00
Cameron Cordes
00b3c80141 RAW: try IFD0 + IFD1 for embedded preview, serve at full size
The thumbnail pipeline's embedded-JPEG extractor only checked IFD1
(THUMBNAIL), which on many Nikon NEFs is missing or zero-length even
when IFD0 (PRIMARY) carries a perfectly good 1-2 MP reduced-resolution
preview the camera writes for in-body review. The previous behavior
produced black thumbs on disk: the buggy IFD1 pointer resolved to a
short byte sequence that happened to satisfy the SOI sanity check,
image::load_from_memory accepted it, and the resize path quietly wrote
a black JPEG.

Now both IFDs are checked and the larger valid JPEG wins. Format-
agnostic: applies to every TIFF-based RAW (NEF / ARW / CR2 / DNG / RAF /
ORF / RW2 / PEF / SRW / TIFF). is_tiff_raw is now pub so main.rs can
gate its full-size handler on it.

Also extends the /image handler so size=full requests for RAW formats
serve the embedded preview as image/jpeg instead of NamedFile-streaming
the original RAW bytes - browsers can't decode a .nef container, so
<img src=...> would otherwise land as a broken image. Falls through to
NamedFile if no preview is present, preserving the historical behavior
for callers that genuinely want the original bytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:52:10 +00:00
Cameron Cordes
7621282419 Thumb orientation + library filter on /photos/exif
Two follow-ups on the same feature branch:

1. Bake EXIF orientation into generated thumbnails. The `image` crate
   doesn't apply Orientation on load, and `save_with_format(..Jpeg)`
   drops EXIF — so portrait phone shots ended up sideways in any client
   that displays the cached thumb directly (no EXIF tag for the browser
   to compensate from). New `exif::read_orientation` reads the tag
   cheaply (no full EXIF parse) and `exif::apply_orientation` does the
   rotate/flip via image's existing `rotate90/180/270` + `fliph/flipv`.
   Applied in both branches of `generate_image_thumbnail` (RAW embedded-
   JPEG path and the regular `image::open` path). Existing thumbnails
   in the cache are still wrong-orientation; wipe the thumb dir or run
   a one-off backfill once this lands.

2. Optional `library` query param on `/photos/exif`. Accepts numeric id
   or name (same shape as `/image?library=...`), resolved via the
   existing `resolve_library_param` helper so a bad value 400s before
   we touch the DAO. Filter is applied post-query in the handler
   rather than pushed into `query_by_exif` to keep the DAO trait
   (and its test mocks) unchanged. Cheap enough at typical library
   counts; can be moved into SQL later if it ever isn't.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:29:36 -04:00
Cameron Cordes
c6f82ebaba Batch EXIF endpoint: GET /photos/exif
Adds a single round-trip projection of `image_exif` for every photo whose
`date_taken` falls in `[date_from, date_to]`. Wraps the existing
`ExifDao::query_by_exif` DAO method which already handles the SQL filter
in one query against the covering index — the only missing piece was
HTTP plumbing.

Designed for window-scoped consumers like Apollo's photo-to-track
matcher, which currently does N+1 (one `/photos` listing + one
`/image/metadata` per photo). Because `/image/metadata` serializes on
`Data<Mutex<dyn ExifDao>>`, that pattern can take 10s+ for windows with
hundreds of photos. The new endpoint takes one mutex acquisition for
the whole batch.

Response shape:
  { photos: [
      { file_path, library_id, library_name,
        camera_model, width, height,
        gps_latitude, gps_longitude, date_taken } ],
    total: N }

Two notes on scope:
- Photos with NULL `date_taken` are excluded by `query_by_exif`'s
  semantics. Filename-extracted dates are not synthesized here; rare
  callers that need that fallback can still hit `/image/metadata`.
- GPS columns are stored as f32 in image_exif to keep row size small;
  the JSON shape widens to f64 so clients don't have to know about the
  on-disk precision.

Library names are pre-mapped from `app_state.libraries` once and
stamped on each row, avoiding an O(rows × libraries) linear scan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:38:53 -04:00
Cameron
b9d5578653 feat(bins): multi-library populate_knowledge + progress UX
populate_knowledge now loads real libraries from the DB instead of
fabricating a single library_id=1 row from BASE_PATH. Adds --library
<id|name> to restrict the walk and validates --path against the selected
library roots. The full library set is still passed to InsightGenerator so
resolve_full_path can probe every root when an insight resolves to a
different library than the one being walked.

Adds indicatif progress bars across the long-running utility binaries via
a shared src/bin_progress.rs helper (determinate bar + open-ended spinner
with consistent styling). Per-batch info! noise is replaced by the bar's
throughput/ETA; warnings and errors route through pb.println so they
scroll above the bar instead of fighting with it.

  populate_knowledge   spinner during scan, determinate bar over all libs
  backfill_hashes      spinner with running hashed/missing/errors counts
  import_calendar      determinate bar; embedding/store failures inline
  import_location_*    determinate bar advancing by chunk size
  import_search_*      determinate bar; pb cloned into the spawn task
  cleanup_files P1     determinate bar over DB paths
  cleanup_files P2     determinate bar; pb.suspend() around y/n/a/s prompt

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:55:33 -04:00