ImageApi

Author	SHA1	Message	Date
Cameron Cordes	fb4df4b195	style: cargo fmt sweep Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 19:01:00 -04:00
Cameron Cordes	1d9b9a0bc4	faces: avoid 40 MB row clone in /faces/embeddings list_embeddings cloned the full FaceDetectionRow inside the filter_map just to pair it with the base64-encoded embedding. The 2 KB BLOB was already on the row — at 20k unassigned faces that's 40 MB of pointless heap traffic per Apollo cluster-suggest run. Move the bytes out via Option::take() so the row drops the BLOB instead of duplicating it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 19:00:55 -04:00
cameron	7998a0c9b0	Merge pull request 'feature/per-library-excluded-dirs' (#71) from feature/per-library-excluded-dirs into master Reviewed-on: #71	2026-05-01 20:11:10 +00:00
Cameron Cordes	58f010f302	docs(claude): pin excluded_dirs entry-form syntax The two entry shapes for libraries.excluded_dirs / EXCLUDED_DIRS are not symmetric: - /sub/path → multi-segment, library-root-anchored, recursive - name → single component anywhere in the tree Without this pinned, a reasonable read of the column doc would be "any path-like string works" — but a multi-segment string without a leading slash silently never matches (the no-slash form scans path components for exact string equality, and components are slash-free). No code change; just documentation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 20:05:58 +00:00
Cameron Cordes	814066551e	multi-library: per-library excluded_dirs Adds a nullable comma-separated TEXT column to the libraries table. Effective excludes for a walk = (env-var globals) ∪ (library.excluded_dirs). Empty / NULL = no library-specific extras; the global env var still applies. Migration (2026-05-01-110000_libraries_excluded_dirs) ALTER TABLE libraries ADD COLUMN excluded_dirs TEXT. NULL on every existing row — no behavior change on upgrade. Library struct + helpers (libraries.rs) - Library gains excluded_dirs: Vec<String>, parsed from the column by parse_excluded_dirs_column (drops empties / whitespace, matches the env-var parser). - Library::effective_excluded_dirs(globals) returns the union. - From<LibraryRow> hydrates the field on AppState construction so /libraries surfaces it. Watcher / walkers / memories Every per-library walker now consults the effective set: - process_new_files (file-watch ingest, RAW/EXIF/face) - process_face_backlog (filter_excluded inherits) - create_thumbnails (startup + new-file branch) - update_media_counts (Prometheus gauge) - cleanup_orphaned_playlists (per-library source-existence check) - memories endpoint (PathExcluder) Effective set is computed once per per-library iteration in the watcher tick and threaded through; called functions retain their flat &[String] signature (no per-library awareness needed inside the walker primitives). Use case: mount a parent directory while a sibling library covers a child subtree, and exclude the child subtree from the parent so the libraries don't double-walk / double-write image_exif. With hash-keyed derived data (Branches B/C), the duplication-avoidance is the only cost prevented — face / tag / insight sharing was already correct via content_hash. Tests: 228 pass (226 from previous + 2 new in libraries::tests: parse_excluded_dirs_column edge cases, effective_excluded_dirs_unions_global_and_per_library). 
CLAUDE.md gains a "Per-library excludes" subsection of the multi-library data model. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 19:54:17 +00:00
cameron	4f17af688e	Merge pull request 'multi-library: operator kill switch via libraries.enabled' (#70) from feature/library-enabled-flag into master Reviewed-on: #70	2026-05-01 19:15:20 +00:00
Cameron Cordes	3598bb2cfe	multi-library: operator kill switch via libraries.enabled A small follow-up to Branches A/B/C. Adds a non-nullable, default-1 boolean column to the `libraries` table that controls whether the watcher considers the library at all. Useful for staging a new mount before committing to ingest, and as a maintenance kill switch when a library needs to be quiet without being unmounted. Migration (2026-05-01-100000_libraries_enabled_flag) ALTER TABLE libraries ADD COLUMN enabled BOOLEAN NOT NULL DEFAULT 1. Existing rows stay enabled — no behavior change on upgrade. Watcher gate (main.rs) At the top of the per-library loop, if !lib.enabled { continue; } — runs BEFORE the availability probe. Disabled libraries don't enter the health map, don't get probed, don't get ingest, don't get any maintenance pass. The initial sweep before the loop's first sleep also skips disabled libraries. Orphan-GC consensus (library_maintenance.rs) all_libraries_online filters disabled libraries out of the consensus check — they're treated as out-of-scope, not as blockers. Otherwise flipping enabled=false would permanently halt orphan GC for the rest of the system, which is the opposite of the intended kill-switch semantics. Cross-library duplicates: safe by construction. Hash-keyed derived data (face_detections, tagged_photo with hash, photo_insights with hash) is anchored by ANY image_exif row carrying the hash. Disabling a library does NOT delete its image_exif rows, so a hash referenced by a disabled library's row stays anchored — derived data survives. collect_orphan_hashes deliberately doesn't filter image_exif by library.enabled for exactly this reason. No HTTP endpoint. Library mutation is rare-enough infra work that a SQL toggle is fine, and a public mutation endpoint without a role / permission story would be poorly-prioritized exposure for a single-user tool. Documented in CLAUDE.md. 
Tests: 226 pass (225 from Branch C + 1 new all_libraries_online_treats_disabled_as_out_of_scope, which proves that even an explicit Stale entry on a disabled library doesn't block the consensus). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 19:10:24 +00:00
cameron	23448cf5e6	Merge pull request 'feature/library-handoff-and-gc' (#69) from feature/library-handoff-and-gc into master Reviewed-on: #69	2026-05-01 18:27:40 +00:00
Cameron Cordes	d809ddee44	library_maintenance: clarify orphan-gc log wording "marked 2 new" parses as "2 new files" on first read — but the unit is content_hashes, and the action is observing them as orphaned (becoming-deleted, not appearing). Reword: "{} new orphan hash(es) marked, {} revived" instead of "marked {} new, revived {}". Also pluralize the deleted counts ("row(s)") and append the pending-set size to the success log so a tick that both deletes and re-marks doesn't lose the trailing-state context. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 18:01:01 +00:00
Cameron Cordes	fa98d147be	library_maintenance: log orphan-gc decisions in stale-library path too run_orphan_gc returned early on the !all_online branch before the final debug/info log line, so the GC was effectively invisible whenever any library was Stale — exactly the dry-run scenario where operators most want to confirm the safety gate is firing. Add the same conditional log inside the early-return branch (plus a "deferred — at least one library Stale" hint in the info-level variant when there's something newly marked). No behavior change beyond observability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 17:14:09 +00:00
Cameron Cordes	5f247be1f1	docs(claude): note in-place edit gap as future Branch D The maintenance pipeline added in Branch C assumes (library_id, rel_path) bytes are stable for as long as the file lives at that path. In-place edits (crop, re-export to same name) bypass process_new_files's already-indexed check, so the row's content_hash stays pinned to the original bytes — tags / faces / insights remain attached to that hash silently. Document the gap and the proposed shape of the fix: - Stale-content detection pass: compare last_modified / size_bytes to fs::metadata, re-hash on mismatch, update image_exif. - "Content branched" semantics on hash change: faces re-run, tags migrate forward (user intent survives a crop), insights migrate + flag for re-generation, favorites follow path. - Apollo derived.db cache invalidation belongs in the same design cycle, not after. Captured here so the design intent is clear before someone hits the case in real life. No code change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:53:08 +00:00
Cameron Cordes	263e27e108	multi-library: handoff + orphan GC with two-tick consensus Branch C of the multi-library data-model rollout. Implements the operational maintenance pipeline pinned in CLAUDE.md → "Multi-library data model" / "Library availability and safety". Branches A and B land first; this branch builds on top. New module: src/library_maintenance.rs Three idempotent passes the watcher runs every tick after the per-library ingest loop: 1. Missing-file scan (per online library) For each Online library, load a paginated page of image_exif rows (IMAGE_EXIF_MISSING_SCAN_PAGE_SIZE, default 500), stat() each one, and delete rows whose source file is NotFound. Permission/IO errors are skipped, never deleted. Capped at IMAGE_EXIF_MISSING_DELETE_CAP_PER_TICK (default 200) per library per tick — so a pathological mount that returns NotFound for everything can't wipe the table in one cycle. Cursor advances across ticks, wraps on partial-page returns, and naturally cycles through the entire library over many minutes. Skipped wholesale for Stale libraries via the existing probe gate. 2. Back-ref refresh (DB-only) For face_detections / tagged_photo / photo_insights: any hash-keyed row whose (library_id, rel_path) no longer matches an image_exif row, but whose content_hash does, is repointed at a surviving image_exif location. Pure SQL with EXISTS guards so rows whose hash is fully orphaned are left alone (the orphan GC handles those). Idempotent; no availability gate needed. This is what makes a recent → archive move invisible to readers: when pass 1 retires the lib-A row, pass 2 pivots tags / faces / insights to lib-B's surviving path before any client notices. 3. Orphan GC (destructive) Hash-keyed derived rows whose content_hash has no image_exif referent are GC-eligible. Two-tick consensus: a hash must be observed orphaned on two consecutive ticks AND every library must be Online for both. 
A single Stale tick within the window cancels all pending deletes (they remain marked but won't be promoted) — they're re-evaluated next tick. The pending set lives in OrphanGcState (in-memory); a watcher restart resets it, which can only delay a delete, never cause one. Hashes that re-appear in image_exif between ticks are "revived" from the pending set (handles transient share unmount / remount). Two new ExifDao methods: - list_rel_paths_for_library_page(library_id, limit, offset) for the paginated missing-file scan. - (count_for_library landed in Branch A.) Watcher wiring (main.rs) Per-library: missing-file scan inside the existing per-library loop, after process_new_files, gated by the same probe check that already protects ingest. After the loop: reconcile (Branch B), back-ref refresh, then run_orphan_gc. The maintenance connection is opened once per tick (image_api::database::connect), used by all three DB-only passes, and dropped at end of tick. CLAUDE.md gains a "Maintenance pipeline" subsection that describes the three passes and their interaction with the existing availability-and-safety policy. Tests: 225 pass (217 from Branch B + 8 new in library_maintenance covering back-ref refresh including the fully-orphaned no-op case, two-tick GC consensus, Stale-tick consensus reset, image_exif re-appearance revival, multi-table delete, and the all_libraries_online helper). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:27:53 +00:00
cameron	a0283a6362	Merge pull request 'multi-library: hash-keyed tagged_photo + photo_insights with reconciliation' (#68) from feature/hash-keyed-derived-data into master Reviewed-on: #68	2026-05-01 16:16:38 +00:00
Cameron Cordes	48cac8c285	multi-library: hash-keyed tagged_photo + photo_insights with reconciliation Branch B of the multi-library data-model rollout. tagged_photo and photo_insights now follow the bytes (content_hash), not the path, matching the policy pinned in CLAUDE.md "Multi-library data model". Branch A's availability probe and EXIF scoping land first; this branch builds on top. Migration (2026-05-01-000000_hash_keyed_derived_data) Adds nullable content_hash columns to tagged_photo and photo_insights, with partial indexes on the non-null subset to keep the index small during the transitional window. The migration backfills from image_exif: * tagged_photo joins on rel_path alone (no library_id available); * photo_insights joins on (library_id, rel_path), unambiguous. Rows whose image_exif hash isn't known yet stay null and the runtime reconciliation pass populates them as the hash backlog drains. Insert-time population TagDao::tag_file looks up image_exif.content_hash by rel_path before inserting; the hash is written into the new column. InsightDao::store_insight does the same scoped to (library_id, rel_path). Caller-supplied hash on InsertPhotoInsight wins; otherwise the DAO does the lookup. Both paths fall back to None if the hash isn't known yet — reconciliation backfills. Reconciliation (database/reconcile.rs) Three idempotent passes the watcher runs once per tick after the per-library backfill loop: 1. tagged_photo NULL hashes → populate from image_exif by rel_path. 2. photo_insights NULL hashes → populate by (library_id, rel_path). 3. photo_insights scalar merge — when multiple is_current rows share a content_hash, keep the earliest generated_at as current; demote the rest. Demoted rows keep their data so /insights/history is unaffected; only the "current" pointer narrows to one per hash. No filesystem dependency, so reconcile doesn't need the availability gate; runs every tick. Logs once when something changed, debug otherwise. 
Tags are set-valued under the policy (union on read, already DISTINCT in queries), so there is no analogous tag-collapse pass — duplicate (tag_id, content_hash) rows across libraries are harmless. Read paths are unchanged in this branch — lookup_tags_batch's existing rel_path-via-hash-sibling expansion still produces the correct merge. A follow-up can simplify reads to use the new column directly for performance. Tests: 217 pass (212 pre-existing + 5 new in reconcile covering NULL-fill, hash-not-yet-known no-op, library scoping on insights, earliest-wins collapse, idempotency). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:52:16 +00:00
cameron	cce8f0c1b7	Merge pull request 'feature/multi-library-data-model' (#67) from feature/multi-library-data-model into master Reviewed-on: #67	2026-05-01 14:40:16 +00:00
Cameron Cordes	48ed7be5d9	libraries: initial availability sweep before watcher's first sleep new_health_map seeds every library as Online, and the watcher's tick loop sleeps WATCH_QUICK_INTERVAL_SECONDS (default 60s) before its first probe — meaning /libraries reported the optimistic default for up to a minute after boot, even when a share was clearly unmounted. Run the same refresh_health pass once at the top of the watcher thread before entering the sleep loop. /libraries is then truthful within milliseconds of the watcher thread starting (effectively from the first HTTP request, since the watcher spawns well before the server binds). The per-tick gate inside the loop is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:33:45 +00:00
Cameron Cordes	eea1bf3181	multi-library: availability probe + scoped EXIF queries + collision fixes Branch A of the multi-library data-model rollout. Three threads of correctness/safety work that ship together because the new mount needs all three before it can land: 1. Library availability probe (libraries.rs, state.rs, main.rs) New LibraryHealth (Online | Stale { reason, since }) and a shared LibraryHealthMap on AppState. Probe checks root_path exists + is_dir + readable + non-empty (relative to a "had_data" signal so fresh mounts aren't downgraded). The watcher tick begins with a refresh_health() per library; stale libraries skip ingest, the hash backfill, and face-detection backlog drains for that tick. The orphaned-playlist cleanup also gates on every library being online — a missing source on a stale library is indistinguishable from a transient unmount, and the cleanup is destructive. /libraries now returns each library with its current health state. Logs only on Online↔Stale transitions so a long outage doesn't spam. New ExifDao::count_for_library is the "had_data" signal. 2. EXIF queries scoped by library_id (database/mod.rs, files.rs, main.rs, tags.rs) query_by_exif gains an Option<i32> library filter; /photos and /photos/exif now pass it. Without this, an EXIF-filtered request scoped to ?library=N returned cross-library results because the handler resolved the library but didn't push it through to SQL. get_exif_batch gains the same option. The watcher's per-library ingest, face-candidate build, and content-hash backfill all scope to their library; the union-mode /photos date-sort path and the library-agnostic tag fan-out (lookup_tags_batch, by design) keep using None. 3. Derivative-path collision fixes (content_hash.rs, main.rs) New content_hash::library_scoped_legacy_path helper: <derivative_dir>/<library_id>/<rel_path>. 
Thumbnail generation (startup walk + watcher needs-thumb check) and serving now use it; serving falls back to the bare-legacy mirrored path so pre-multi-library deployments keep working without regeneration. Without this, lib2 with the same rel_path as lib1 would have its thumbnail request short-circuit to lib1's image. Orphaned-playlist cleanup walks every library when checking for the source video (was: BASE_PATH only). Without this, mounting a 2nd library and waiting 24h would delete every playlist whose source lived only in the 2nd library. The HLS playlist write path collision (filename-only basename, not rel_path) is left as a known issue with a TODO at the call site — the actor-pipeline rewrite belongs in Branch B/C. Tests: 212 pass (cargo test --lib). New tests cover the probe states (online / missing root / non-dir / empty-with-prior-data), refresh_health transitions, query_by_exif scoping, get_exif_batch keying on (library_id, rel_path), library_scoped_legacy_path, and count_for_library. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:12:49 +00:00
Cameron Cordes	2f91891459	docs(claude): pin multi-library data model + availability/safety policy Adds a "Multi-library data model" section that classifies each table as intrinsic-to-bytes (hash-keyed), user-intent-about-a-photo (hash-keyed), or library-administrative ((library_id, rel_path)). Spells out merge semantics on read (union for set-valued, earliest-wins for scalar), write attribution (binds to bytes, not to current library), the transitional-state rules for hash-less rows, library handoff behavior on archive moves, and orphan GC. Adds a "Library availability and safety" subsection: every watcher tick begins with a presence probe; destructive paths (move-handoff re-keying, orphan GC) require both/all libraries online and confirmed-clean for two consecutive ticks. A NAS reboot, USB pull, or VPN drop must never trigger destruction — the worst case is that derived-data work pauses until the share returns. The face_detections table is referenced as the existing reference implementation of the policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:11:42 +00:00
cameron	3d162105f7	Merge pull request 'feature/edit-tag' (#66) from feature/edit-tag into master Reviewed-on: #66	2026-05-01 01:03:40 +00:00
Cameron	98601973f7	faces: log at the three 503 paths in update_face_handler PATCH /image/faces/{id} can return 503 from three places (face client disabled, transient embed error, mid-flight disable) and none of them were logging — operator sees the status code but nothing in the Rust log explaining why. Add warn! lines at each so future bbox-edit failures aren't silent. Response body is unchanged so existing clients keep working. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:57:51 -04:00
Cameron	862917b0d1	gitignore: SQLite WAL runtime + local docs/specs dirs .db-shm / .db-wal show up in the working tree whenever the server runs (the WAL/journal pragmas in connect()), and /docs and /specs hold per-feature design notes that stay local per the project's "spec docs not in git" convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:31:19 -04:00
Cameron	44d677528e	tags: add edit + delete endpoints, enable FK enforcement PUT /image/tags/{id} renames a tag globally; DELETE /image/tags/{id} removes a tag and every photo's reference. Rename returns 200/404/409 (case-insensitive name conflict) / 400 (empty name); delete returns 204/404. New migration adds a UNIQUE COLLATE NOCASE index on tags.name with a pre-flight pass that collapses existing case-insensitive duplicates onto the lowest id. The connection setup now sets PRAGMA foreign_keys = ON. The schema already declares ON DELETE CASCADE / SET NULL on several tables — those clauses were documentation-only because SQLite has FK enforcement off per-connection by default. Audited every diesel::delete site; each touches either no inbound FKs or has a matching policy. delete_tag relies on the tagged_photo cascade instead of doing manual cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:26:35 -04:00
cameron	89b743ba54	Merge pull request 'faces: count distinct content_hash in stats total_photos' (#65) from face-stats-dedup-hash into master Reviewed-on: #65	2026-04-30 22:43:58 +00:00
Cameron Cordes	323097c650	faces: count distinct content_hash in stats total_photos face_detections is keyed on content_hash (one row per unique bytes, shared across libraries / duplicate paths) but total_photos was COUNT(*) over image_exif rows. A file present at multiple rel_paths or across libraries inflated the denominator without inflating the numerator, leaving a permanent gap (e.g. 1101/1103 with nothing actually pending detection). Switch total_photos to COUNT(DISTINCT content_hash) so numerator and denominator live in the same domain. Exclude rows with NULL content_hash from the count — they're held in the hash-backfill backlog, not the detection backlog, and counting them pins the bar below 100% for the duration of that pass. CLAUDE.md: document the stats domain rule next to the rest of the face-detection notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:41:20 +00:00
cameron	d0833177c7	Merge pull request 'feature/face-stats-exclude-videos' (#64) from feature/face-stats-exclude-videos into master Reviewed-on: #64	2026-04-30 21:17:19 +00:00
Cameron Cordes	67abd8d8ff	style: cargo fmt Pre-existing whitespace drift in test bodies, normalized by rustfmt. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 21:16:34 +00:00
Cameron Cordes	0840d55c70	faces: exclude videos from backlog drain and SCANNED denominator list_unscanned_candidates pulled every hashed image_exif row, including videos. filter_excluded then dropped them client-side without writing a marker, so the same set re-appeared every watcher tick — emitting the "backlog drain — running detection on N candidate(s)" log forever and producing no progress. face_stats.total_photos counted the same video rows in the denominator, so the SCANNED percentage was structurally capped below 100%. Add an image-extension SQL predicate (case-insensitive, sourced from file_types::IMAGE_EXTENSIONS) and apply it to both queries. Videos never enter the candidate set, total_photos counts only what can actually be scanned, and 100% becomes reachable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 21:16:30 +00:00
cameron	dbb046dfa8	Merge pull request 'indexer: prune EXCLUDED_DIRS at WalkDir time, extract enumerate_indexable_files' (#63) from feature/exclude-dirs-at-index-time into master Reviewed-on: #63	2026-04-30 20:24:18 +00:00
Cameron Cordes	f50655fb21	indexer: apply EXCLUDED_DIRS to remaining WalkDir callers Audit follow-up to `5bf4956`. The same `@eaDir` pruning that protects the indexer also needs to protect the other walks under library roots: - `create_thumbnails` walks every file in every library to generate thumbnails. Without EXCLUDED_DIRS, it would generate thumbnails of Synology's `SYNOFILE_THUMB_*.jpg` thumbnails (thumbnails of thumbnails). - `update_media_counts` walks for the prometheus IMAGE / VIDEO gauges. Without EXCLUDED_DIRS, the gauges over-count by however many phantom `@eaDir` images live alongside the real photos. - `cleanup_orphaned_playlists` walks BASE_PATH searching for source videos by filename. EXCLUDED_DIRS isn't a behavior change for typical Synology mounts (no .mp4 in @eaDir), but it's a correctness win for any operator-defined exclude that happens to contain video. Refactor: add `walk_library_files(base, excluded_dirs) -> Vec<DirEntry>` to file_scan.rs as the shared primitive. `enumerate_indexable_files` now layers media-type + mtime filters on top of it. One new test covers the lower-level helper (returns all extensions, prunes excluded subtrees). `generate_video_gifs` (currently `#[allow(dead_code)]`, not reachable from main) gets the `update_media_counts` signature update and reads EXCLUDED_DIRS from env so a future revival isn't broken — but its WalkDir walk stays raw because the dual lib/bin compile makes the file_scan module path non-trivial there. Tagged with a comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:21:17 +00:00
Cameron Cordes	5bf49568f1	indexer: prune EXCLUDED_DIRS at WalkDir time, extract enumerate_indexable_files Synology drops `@eaDir/.../SYNOFILE_THUMB_*.jpg` files alongside every photo. The face-detect pipeline already filters those out via `face_watch::filter_excluded`, but the filter runs after the indexer has already inserted rows into `image_exif`. Result: phantom rows whose content_hash never matches a `face_detections` row, so the anti-join in `list_unscanned_candidates` returns them every tick. They're filtered out at runtime, no marker is written, and the cycle repeats forever — log spam, wrong stats denominator, and on a real Synology library the phantom rows balloon into the hundreds of thousands. Move the exclusion to the WalkDir pass, where filter_entry can prune whole subtrees instead of walking and discarding leaves. Extract the pre-existing 30-line walker chain in main.rs::process_new_files into `file_scan::enumerate_indexable_files` so it's testable in isolation. Six tests cover the bug (eadir prune), nested patterns, absolute-under-base syntax, non-media filtering, modified_since semantics, and forward-slash rel_path normalization. Out of scope (other WalkDir callers in main.rs that don't yet apply EXCLUDED_DIRS — thumbnail gen at 1309, media scan at 1377, video playlist scan at 1685, and two nested walks at 1709 / 1743): separate audit PR. Operator note: existing phantom rows still need a one-shot cleanup — DELETE FROM face_detections WHERE content_hash IN ( SELECT content_hash FROM image_exif WHERE rel_path LIKE '%/@eaDir/%' ); DELETE FROM image_exif WHERE rel_path LIKE '%/@eaDir/%' OR rel_path LIKE '@eaDir/%'; Run before attaching a fresh Synology-sourced library. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 19:29:37 +00:00
cameron	f358e83050	Merge pull request 'sqlite: enable WAL + busy_timeout in connect(); 408/413/429 transient' (#62) from feature/sqlite-wal-and-413-transient into master Reviewed-on: #62	2026-04-30 18:16:38 +00:00
Cameron Cordes	db9dc63e5e	sqlite: enable WAL + busy_timeout in connect(); 408/413/429 transient The DB connection helper now sets `journal_mode=WAL`, `busy_timeout=5000`, and `synchronous=NORMAL` on every connection. 13+ DAOs each open their own connection through this helper and share one SQLite file — without WAL, a writer's exclusive lock blocks readers and `load_persons` racing the face-watch write storm errored instantly with "database is locked". GPU face inference made this visible by speeding detect ~10× and flooding the writer side. WAL persists in the file once set so the debug binaries that bypass connect() inherit it automatically. Also widen face_client.rs's classifier: 408 / 413 / 429 are now Transient instead of Permanent. These are operator-fixable proxy/infra errors; marking them Permanent poisons every affected photo with status='failed' and requires manual SQL to recover. Specifically, Apollo's nginx defaulted to a 1 MB body cap and silently rejected normal-size photos before they reached the backend — the deferred-and-retry contract is the right behavior for that class of fault. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 18:13:15 +00:00
cameron	9443c91f88	Merge pull request 'Face Recognition / People Integration' (#61) from feature/face-recog-phase3-file-watch into master Reviewed-on: #61	2026-04-30 17:22:08 +00:00
Cameron Cordes	96c539764c	docs: face detection system section + per-tick backlog drain env vars CLAUDE.md gets an "Important Patterns → Face detection system" entry covering the schema (why content_hash and not (library_id, rel_path)), the file-watch hook + per-tick backlog drains, auto-bind on tag-name match, manual-face create with EXIF orientation handling, and the rerun-preserves-manual-rows contract. README's face section adds the two new env vars (FACE_BACKLOG_MAX_PER_TICK and FACE_HASH_BACKFILL_MAX_PER_TICK) shipped this cycle so operators know they're tunable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:06:42 +00:00
Cameron Cordes	675b4a4849	faces: add .env.example template covering all documented env vars The face-recognition plan and CLAUDE.md document the full env-var surface (face detection knobs, Apollo / Ollama / OpenRouter / SMS integrations, watch intervals, RAG flags), but no example file existed — operators copying the project to a new deploy had nothing to start from. Group by section, comment out optional integrations so a minimal copy boots without external services. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 13:51:45 +00:00
Cameron Cordes	5e1bad3179	faces: filter videos out of detection candidate set The backlog drain pulls every hashed image_exif row, which includes videos. Sending them to Apollo just produces 422 decode_failed → status='failed' markers, burning a round-trip per video and inflating the FAILED stat. Widen filter_excluded to also drop anything is_image_file rejects. Covers both call sites (file-watch hook and per-tick backlog drain) without plumbing a second filter through. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 12:45:55 +00:00
Cameron Cordes	1971eeccd6	faces: drain backfill + detection backlog every tick, not just full scans Symptom: ImageApi restart, then ~60 minutes of silence — no face_watch lines at all. Cause: backfill + face-detection candidate build were both gated inside process_new_files, which during quick scans (every 60s) only walks files modified in the last interval. The pre-existing unhashed / unscanned backlog never entered the candidate set, so it only drained on the full-scan path (default once per hour). Surfaced as "scan stuck at 1101/13118" — most of those rows were waiting on the next full scan. Two new per-tick passes that work directly off the DB: (1) backfill_unhashed_backlog uses ExifDao::get_rows_missing_hash to pull unhashed rows in id order, capped (FACE_HASH_BACKFILL_MAX_PER_TICK default 2000), and writes content_hash for each. No filesystem walk — the walk was the gating filter that hid the backlog. (2) process_face_backlog uses a new FaceDao::list_unscanned_candidates (LEFT-anti-join on content_hash via raw SQL, GROUP BY hash so duplicates fire one detect call) to pull a capped batch of hashed-but-unscanned rows (FACE_BACKLOG_MAX_PER_TICK default 64) and runs the existing face_watch detection pipeline on them. Both run only when face_client.is_enabled(). The cap on (2) is small because each candidate is a real Apollo round-trip — 64/tick at 60s quick interval ≈ 64 detections/min, which paces an 8-core CPU inference comfortably while keeping a steady flow visible in logs. process_new_files's own backfill stays in place for the same-tick flow (a brand-new upload gets hashed AND face-scanned in the tick where it's discovered) but is now belt-and-suspenders. Test backstop pinning the new DAO method's filter contract: only hashed, unscanned, in-library rows are returned; scanned rows, unhashed rows, and other-library rows are filtered out.	2026-04-30 01:46:49 +00:00
Cameron Cordes	c2c1fe5b8b	faces: bbox crop respects EXIF orientation + pads enough for RetinaFace Two reasons manually-drawn bboxes were never resolving a face on re-detection: (1) The bbox arrives in display space (browser already applied EXIF orientation when rendering the carousel), but the `image` crate in crop_image_to_bbox opens raw pre-rotation pixels. For any phone photo with Orientation 6/8/etc., applying the bbox without rotating first crops a completely different region of the image — landing on background, hair, or empty pixels. Now reads the EXIF Orientation tag and applies it before indexing into the canonical-oriented dims. (2) Padding was 10 % on each side. A typical 200×250 face bbox + 10 % becomes ~240×300; insightface resizes that to det_size=640, so the face fills ~95 % of the input. RetinaFace's anchors expect faces at 20–60 % of input dimensions; at 95 % it routinely returns zero detections. Bumped to 50 % padding so the crop is 2× the bbox dims and the face occupies ~50 % of the input — anchor-friendly. Bbox is still clamped to image bounds, so edge-of-image cases just get less padding on the clipped side. Together these explain why bbox-edit re-embed practically always fell into the "no face detected" branch (and bbox-edit reverts without the recent soft-fallback commit). Per-photo embedding quality also improves slightly — same face, more context, better landmarks for ArcFace.	2026-04-30 01:06:08 +00:00
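The padding arithmetic in c2c1fe5b8b can be sketched as below. This is a minimal illustration, not the actual crop_image_to_bbox code: the `PixelRect` type and `pad_and_clamp` name are hypothetical, and the EXIF orientation step is assumed to have already been applied to the image dimensions.

```rust
/// Axis-aligned rectangle in pixel coordinates (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PixelRect {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Expand `bbox` by `pad_frac` of its width/height on each side, then clamp
/// to the image bounds. With pad_frac = 0.5 the unclamped crop is 2x the bbox
/// in each dimension; near an edge the clipped side simply gets less padding.
pub fn pad_and_clamp(bbox: PixelRect, pad_frac: f32, img_w: u32, img_h: u32) -> PixelRect {
    let pad_x = (bbox.w as f32 * pad_frac).round() as i64;
    let pad_y = (bbox.h as f32 * pad_frac).round() as i64;

    // Work in i64 so subtracting the padding cannot underflow.
    let left = (bbox.x as i64 - pad_x).max(0);
    let top = (bbox.y as i64 - pad_y).max(0);
    let right = (bbox.x as i64 + bbox.w as i64 + pad_x).min(img_w as i64);
    let bottom = (bbox.y as i64 + bbox.h as i64 + pad_y).min(img_h as i64);

    PixelRect {
        x: left as u32,
        y: top as u32,
        w: (right - left).max(0) as u32,
        h: (bottom - top).max(0) as u32,
    }
}
```

For a 200×250 bbox well inside the image, `pad_frac = 0.1` gives the 240×300 crop described in the message and `pad_frac = 0.5` gives 400×500.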
Cameron Cordes	5a2f406429	faces: bbox edits survive when re-detection finds no face Moving a tagged bbox off-center (to fine-tune position, or onto a back-of-head the operator already manually tagged) made update_face_handler 422 because the re-embed step ran detection on the new crop and found nothing. Frontend's catch then reverted the optimistic update — visible as the bbox snapping back the moment the user released their drag. The re-embed is a soft contract: a fresh ArcFace vector is preferable, but the operator's bbox edit is sacred. Now: - empty faces[] → keep old embedding, apply the bbox, log info - permanent embed error → keep old embedding, apply the bbox, log info - bad-bytes embedding → keep old embedding, apply the bbox, log warn - transient failure (cuda_oom, engine unavailable) still 503s so the operator can retry — those are recoverable and we don't want to silently drift cluster math on retries that succeed later Cost: a slightly stale embedding for the row, which marginally affects clustering / auto-bind cosine for files re-detected against this person. Accepted because dropping the user's manual drag every time the new crop happens to lose detection is a much worse UX — especially for the force-create rows (back of head, profile) where re-detection will always fail.	2026-04-30 01:01:07 +00:00
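The soft-contract rules listed in 5a2f406429 amount to a small decision table. The sketch below restates them as a pure function; the enum and function names are hypothetical and do not come from update_face_handler itself.

```rust
/// Possible results of the re-embed step (hypothetical names for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReembedOutcome {
    Fresh,             // detector found a face and returned a usable vector
    NoFaceInCrop,      // empty faces[]
    PermanentError,    // permanent embed error
    BadEmbeddingBytes, // embedding bytes could not be decoded
    Transient,         // e.g. cuda_oom, engine unavailable
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LogLevel {
    Info,
    Warn,
}

/// What the handler does with the operator's bbox edit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BboxEditDecision {
    ApplyWithNewEmbedding,
    ApplyKeepingOldEmbedding(LogLevel),
    Retry503,
}

/// The bbox edit is applied in every case except a transient failure,
/// which still surfaces as a 503 so the operator can retry.
pub fn decide(outcome: ReembedOutcome) -> BboxEditDecision {
    match outcome {
        ReembedOutcome::Fresh => BboxEditDecision::ApplyWithNewEmbedding,
        ReembedOutcome::NoFaceInCrop | ReembedOutcome::PermanentError => {
            BboxEditDecision::ApplyKeepingOldEmbedding(LogLevel::Info)
        }
        ReembedOutcome::BadEmbeddingBytes => {
            BboxEditDecision::ApplyKeepingOldEmbedding(LogLevel::Warn)
        }
        ReembedOutcome::Transient => BboxEditDecision::Retry503,
    }
}
```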
Cameron Cordes	6a6a4a6a46	tags: batch lookup expands content-hash siblings cross-library The first cut matched by rel_path only — fine for single-library deploys but wrong for multi-library setups where the same content lives under different rel_paths (e.g. a backup mount holding copies of the primary library). A tag applied under library A would silently not appear in the library-B grid badge even though the carousel's per-path /image/tags would resolve it correctly via siblings. The batch handler now does the expansion server-side in three queries regardless of input size: 1. image_exif batch lookup → query path → content_hash 2. image_exif JOIN by content_hash → all sibling rel_paths sharing each hash (paths are deduped across libraries) 3. tagged_photo + tags JOIN over the union of (query + sibling) rel_paths Tags are then aggregated back to query paths via a sibling→originals reverse map, deduped by tag id. Files without a content_hash (just indexed, hash compute pending, etc.) skip step 2 and only get tags from their own rel_path — same fallback the per-path handler uses. Adds ExifDao::get_rel_paths_for_hashes (batch counterpart of get_rel_paths_by_hash) chunked at 500 to stay under SQLite's SQLITE_LIMIT_VARIABLE_NUMBER. Five queries for a 4k-photo grid is still ~800x cheaper than per-path HTTP fan-out.	2026-04-30 00:36:44 +00:00
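The aggregation step in 6a6a4a6a46 can be illustrated with in-memory maps standing in for the three query results. This is a sketch under assumptions: the function name and map shapes are hypothetical, and it walks forward from each query path rather than building the sibling→originals reverse map the commit describes, which yields the same result for this purpose.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

pub type TagId = i64;

/// Collect tags for each query path from the path itself plus every sibling
/// path that shares its content hash. Tag ids are deduped via the set.
/// Paths with no content hash only get tags from their own rel_path.
pub fn aggregate_tags(
    query_paths: &[&str],
    path_to_hash: &HashMap<&str, &str>,       // step 1: query path -> content_hash
    hash_to_paths: &HashMap<&str, Vec<&str>>, // step 2: content_hash -> sibling rel_paths
    tags_by_path: &HashMap<&str, Vec<TagId>>, // step 3: rel_path -> tag ids
) -> BTreeMap<String, BTreeSet<TagId>> {
    let mut out = BTreeMap::new();
    for &query in query_paths {
        let mut sources: Vec<&str> = vec![query];
        if let Some(hash) = path_to_hash.get(query) {
            if let Some(siblings) = hash_to_paths.get(hash) {
                sources.extend(siblings.iter().copied());
            }
        }

        let mut tags = BTreeSet::new();
        for path in sources {
            if let Some(ids) = tags_by_path.get(path) {
                tags.extend(ids.iter().copied());
            }
        }

        // Only paths with at least one tag appear in the response.
        if !tags.is_empty() {
            out.insert(query.to_string(), tags);
        }
    }
    out
}
```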
Cameron Cordes	3112260dc8	tags: batch lookup endpoint to collapse photo-match fan-out Apollo's photo-match enrichment fanned out one ``GET /image/tags?path=`` per record (bounded concurrency 20) — for a 4k-photo time window that meant ~4000 round-trips, each briefly contending the tag-dao mutex. The cost dwarfed the actual SQL. Add a single ``POST /image/tags/lookup`` body ``{paths: [...]}`` returning ``{path: [tag, ...]}`` with only paths that have at least one tag. SqliteTagDao gains ``get_tags_grouped_by_paths`` which JOINs tagged_photo + tags and chunks the IN clause at 500 (safely under SQLite's variable limit). Five queries for a 4k-photo grid is ~800x cheaper than 4k HTTP calls. Trade-off: the batch matches by rel_path directly and does not do the cross-library content-hash sibling expansion that the per-path ``GET /image/tags`` does. For Apollo's grid that's accepted as deliberate — single-library deploys see no difference, multi-library deploys with rel_path-divergent siblings might miss a tag in the grid badge but the carousel still resolves full sibling tags via the per-path endpoint when opened. If sibling sharing in the grid becomes load-bearing, extend the handler to JOIN image_exif on content_hash.	2026-04-30 00:28:33 +00:00
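Chunking the IN clause at 500, as 3112260dc8 describes, might look like the following. It is a sketch only: the `run_query` callback stands in for the real DAO call, and the SQL column names (`rel_path`, `tag_id`, `name`) are assumptions rather than the actual schema.

```rust
use std::collections::HashMap;

/// Chunk size from the commit message, chosen to stay under SQLite's
/// bound-variable limit.
pub const IN_CLAUSE_CHUNK: usize = 500;

/// Build the `?, ?, ?` placeholder list for an IN clause with `n` bound values.
pub fn placeholders(n: usize) -> String {
    vec!["?"; n].join(", ")
}

/// Run one query per chunk of at most IN_CLAUSE_CHUNK paths and merge the
/// (path, tag) rows into a path -> tags map. `run_query` is a hypothetical
/// stand-in for executing the statement with the chunk as bound parameters.
pub fn lookup_in_chunks<F>(paths: &[String], mut run_query: F) -> HashMap<String, Vec<String>>
where
    F: FnMut(&str, &[String]) -> Vec<(String, String)>,
{
    let mut out: HashMap<String, Vec<String>> = HashMap::new();
    for chunk in paths.chunks(IN_CLAUSE_CHUNK) {
        // Column names here are illustrative assumptions.
        let sql = format!(
            "SELECT tp.rel_path, t.name FROM tagged_photo tp \
             JOIN tags t ON t.id = tp.tag_id WHERE tp.rel_path IN ({})",
            placeholders(chunk.len())
        );
        for (path, tag) in run_query(&sql, chunk) {
            let tags = out.entry(path).or_default();
            if !tags.contains(&tag) {
                tags.push(tag);
            }
        }
    }
    out
}
```

Paths with no tag rows never enter the map, which matches the "only paths that have at least one tag" response shape.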
Cameron Cordes	16abacf4c5	faces: backfill no longer stalls on chronic-error files at the front The content-hash backfill capped at 500/tick AND counted errors against that cap. So a pocket of files that errored every time (vanished mid-scan, permission denied, unreadable) at the head of the exif_records iteration order burned the entire budget every tick and the rest of the backlog never advanced — surfacing as a face-scan stuck at e.g. 44% with no progress. Without a content_hash, those photos never become face-detection candidates, so it looks like detection is broken when really it's the prerequisite hash that isn't filling. Two fixes: - Cap on successes only. Errors still get counted and logged but don't burn the per-tick budget; the loop keeps moving past them to the working files behind. Errors are bounded by the unhashed backlog size (each record walked at most once per tick), so this can't run away. - Always log the unhashed backlog count when non-zero. Previously "stuck at 44%" looked silent from the outside; now every tick surfaces "backfilled N/M; K still need backfill" so an operator can tell backfill is making progress (or isn't). Also bumps the default cap from 500 to 2000. Hashing is cheap (blake3 + one DB UPDATE), and 500 was conservative for a personal-scale library where 10k+ unhashed files is a normal first-run state.	2026-04-30 00:03:26 +00:00
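The "cap on successes only" change in 16abacf4c5 is easiest to see in a stripped-down loop. The sketch below is illustrative: `backfill_tick`, `BackfillStats`, and the `hash_one` callback are hypothetical names, and the real code works against exif_records rather than a slice.

```rust
/// Per-tick summary, mirroring the "backfilled N/M; K still need backfill" log.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BackfillStats {
    pub hashed: usize,
    pub errors: usize,
    pub remaining: usize,
}

/// Walk the unhashed backlog once. Only successful hashes count against
/// `max_successes`; errors are tallied but do not burn the budget, so a
/// pocket of chronically failing files at the front cannot stall the rest.
pub fn backfill_tick<T, F>(backlog: &[T], max_successes: usize, mut hash_one: F) -> BackfillStats
where
    F: FnMut(&T) -> Result<(), String>,
{
    let mut hashed = 0;
    let mut errors = 0;
    for item in backlog {
        if hashed >= max_successes {
            break;
        }
        match hash_one(item) {
            Ok(()) => hashed += 1,
            Err(_) => errors += 1,
        }
    }
    BackfillStats {
        hashed,
        errors,
        // Failed items stay in the backlog for the next tick.
        remaining: backlog.len() - hashed,
    }
}
```

Each item is visited at most once per tick, so the error count is bounded by the backlog size, as the message notes.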
Cameron Cordes	891a9982ef	faces: force-create path for regions the detector can't see Adds an opt-in 'force' flag to POST /image/faces. When set, the handler skips the Apollo embed call entirely and stores the row with a 2048-byte zero-vector embedding under the sentinel model_version 'manual_no_embed'. The row participates as a browse-by-person tag but is excluded from clustering and auto-bind: - face_clustering._decode_b64_embedding filters norm<=0 (already) - cluster suggester groups by model_version, so the sentinel never mixes with real buffalo_l rows - cosine_similarity with a zero vector resolves to 0/NaN, never crossing the 0.4 auto-bind threshold Use case: tag someone looking away from the camera, profile shot, heavily-occluded face — anywhere the detector returns no_face_in_crop on the user's drawn region. The frontend only sets force=true after a 422 from a strict create plus an explicit operator confirmation, so the normal "draw a centered face" UX still gets a real ArcFace embedding.	2026-04-29 23:49:34 +00:00
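The claim in 891a9982ef that a zero-vector embedding never crosses the auto-bind threshold follows from how cosine similarity behaves with a zero norm. A minimal sketch, assuming f32 components (so 2048 bytes corresponds to 512 values) and a plain `>=` comparison against 0.4; the function names are hypothetical:

```rust
/// Auto-bind threshold quoted in the commit message.
pub const AUTO_BIND_THRESHOLD: f32 = 0.4;

/// Plain cosine similarity with no zero-norm guard. For a zero vector the
/// result is 0.0 / 0.0, which is NaN in IEEE 754 arithmetic.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Every ordered comparison with NaN is false, so a sentinel zero-vector row
/// can never satisfy the threshold check.
pub fn crosses_auto_bind(a: &[f32], b: &[f32]) -> bool {
    cosine_similarity(a, b) >= AUTO_BIND_THRESHOLD
}
```

An implementation that guards the zero norm and returns 0 instead would reach the same outcome, since 0 is also below 0.4.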
Cameron Cordes	0eaf27d2d3	faces: cover hydrate_face_with_person — assigned + unassigned branches Two unit tests pinning the response shape that PATCH/POST /image/faces relies on. They use the existing in-memory SQLite harness and exercise the helper directly: - assigned: person_name resolves through the persons join and bbox / source / person_id round-trip cleanly. - unassigned: person_name is None (not stale, not omitted), person_id is None. These would have caught the prior regression — when the handlers returned a bare FaceDetectionRow, person_name was structurally absent from the response shape. A test that asserts person_name is populated when person_id is set forces the join (or any equivalent) to exist. A dangling-person_id case isn't covered: the FK on face_detections makes that state structurally impossible at rest (ON DELETE SET NULL zeroes the column when a person is removed), so there's nothing to defend against.	2026-04-29 23:41:52 +00:00
Cameron Cordes	0c2f421a1f	faces: PATCH/POST /image/faces returns person_name with the row Both create_face_handler and update_face_handler returned the bare FaceDetectionRow, so PATCH /image/faces/{id} (used by both bbox edits and person assignment) replied without person_name. The carousel overlay does an optimistic replace on this row — replacing the joined FaceWithPerson with a row that has person_name = undefined visibly dropped the VFD label off the bbox after every save. Add a small hydrate_face_with_person helper that does the persons lookup and assembles a FaceWithPerson, used by both handlers. The list endpoint already does the join, so the PATCH/POST shape now matches it.	2026-04-29 23:38:24 +00:00
Cameron Cordes	43cb60d3ad	faces: re-embed on bbox edit instead of leaving the embedding stale Phase 2 stored the new bbox on PATCH /image/faces/{id} but logged "embedding now stale (Phase 3 will re-embed)" and moved on. That left the embedding column pointing at the old face area while the bbox described a new one — auto-bind cosine similarity and the cluster suggester would silently rank the row as "the same face it was before the edit" forever after, even though the geometry no longer matched. Now: when the PATCH includes a bbox, the handler: 1. Looks up the row to find its photo (library_id + rel_path). 2. Crops the new bbox region with the same crop_image_to_bbox helper manual-create uses (10% pad on each side so the detector has ear/jaw context). 3. POSTs the crop to face_client.embed for a fresh ArcFace vector. 4. Stores both the new bbox AND the new embedding in one update_face transaction. Errors map cleanly: - face_client disabled → 503 (bbox edit needs Apollo). - decode failure / no face in crop → 422. - Apollo CUDA OOM / unavailable → 503 transient. - Underlying row missing → 404. About 100-500ms per edit on CPU, dominated by Apollo's inference call. Acceptable for a manual operator action; the alternative (stale embedding) silently broke the rest of the face stack. Prerequisite for the upcoming carousel-side draw/resize bbox UI — without re-embed, every operator-driven bbox tweak would corrode the clustering/auto-bind quality. ApiPatchFaceBody on Apollo's side already passes bbox through verbatim, so no Apollo change needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 23:10:25 +00:00
Cameron Cordes	7303fb8aa3	faces: ignore/junk bucket — DB schema + lazy-create endpoint A single global "Ignored" person row, marked is_ignored=true, that the frontend lazily creates on first use to hold strangers, false detections, and faces the user doesn't want bound to a real person. Schema (new migration 2026-04-29-000200_add_is_ignored): - persons.is_ignored BOOLEAN NOT NULL DEFAULT 0 - Partial index on (is_ignored) WHERE is_ignored = 1; small WHERE set means a tiny index that only ever services the bucket lookup. Why a real persons row instead of a separate table or status enum: - face_detections.person_id stays a clean foreign key — no special code paths for "ignored faces" anywhere else in the schema. - The cluster-suggester already filters by `person_id IS NULL`, so bound-to-ignored faces are naturally excluded from re-clustering without any change. - merge / rename / delete all work on it with the existing routes (the management UI just hides it from default views). DAO additions / changes: - get_or_create_ignored_person (idempotent; race-safe via the UNIQUE COLLATE NOCASE on persons.name + retry-on-409 fallback). - list_persons gains an include_ignored parameter; default false so the management screen hides the bucket unless asked. - find_persons_by_names_ci filters is_ignored=0 in SQL so the auto-bind path can NEVER target the bucket — even if the user happens to tag photos as "Ignored", the heuristic look-up skips it. Bucket assignment is always an explicit operator action. - update_person accepts is_ignored: Option<bool> so a person can be moved into / out of the bucket without a delete + recreate. Routes: - POST /persons/ignore-bucket — returns the bucket, creating it on first call. Frontend uses this lazily right before binding. - GET /persons gains ?include_ignored=true; default behavior unchanged. - PATCH /persons/{id} now accepts is_ignored. 
Tests: ignore_bucket_idempotent_and_filters_auto_bind covers the contract: bucket is idempotent across calls, find_persons_by_names_ci skips it (even on exact name match), default list_persons hides it, include_ignored=true surfaces it. All other tests updated to pass the new is_ignored: false / Option<bool> fields explicitly. cargo test --lib: 181/0; fmt + clippy clean for new code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 22:48:16 +00:00
Cameron Cordes	0e160f5d22	faces: include bbox on /faces/embeddings response Apollo's cluster suggester wants to render a face-cropped thumbnail for each cluster's representative — a multi-person photo with the cluster about 'one' of them was unreadable when the thumb showed the whole image. Plumbing bbox through means the UI can crop to the rep face without an extra round-trip per cluster. FaceEmbeddingRow gains bbox_x/y/w/h (Option<f32>, mirrors the column nullability — for status='detected' rows the CHECK constraint guarantees they're populated, but the type stays nullable as documentation). list_embeddings already loaded these from the underlying FaceDetectionRow; this commit just stops dropping them when constructing the response. No DB changes; no behavior change for existing callers (the new fields are additive). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 21:01:58 +00:00
Cameron Cordes	a24fac5511	faces: backfill missing content_hash from the file watcher Photos indexed before content-hashing landed (or where the hash compute failed silently on insert) end up in image_exif with NULL content_hash. build_face_candidates keys on content_hash, so those rows would never become face candidates without backfill — symptom: face detection logs nothing despite photos being in the library and the watcher running. The dedicated `backfill_hashes` binary already handles this; this commit lets the watcher self-heal during full scans so the deploy 'just works' for face recognition without operator action. Idempotent — subsequent scans see populated hashes and no-op. Bounded per tick by FACE_HASH_BACKFILL_MAX_PER_TICK (default 500) so a watcher tick on a 50k-photo legacy library doesn't blake3 every file in one shot. For very large backlogs the dedicated binary is still faster (no DAO mutex contention with the watcher loop). Only runs when face_client.is_enabled(), so legacy deploys without APOLLO_FACE_API_BASE_URL keep the same behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 20:41:08 +00:00
Cameron Cordes	23f4941471	faces: surface enabled/disabled state + per-tick candidate count Manual deploy debugging: 'Saved thumbnail' logs were visible (boot-time thumbnail backfill) but no face_watch logs were appearing, with no obvious way to tell whether the integration was disabled, hadn't reached a full scan yet, or had simply seen no new files. Two log lines: - watch_files startup: 'Face detection: ENABLED' / 'DISABLED (set APOLLO_FACE_API_BASE_URL or APOLLO_API_BASE_URL to enable)' so you can tell at a glance whether the env wired through. - process_new_files (debug-level): 'face_watch: scan tick — N image file(s) walked, M candidate(s) (library 'main', modified_since=...)' so an empty-candidate scan is distinguishable from a misconfigured or skipped one without bumping log level for the rest of the watcher. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 20:19:17 +00:00

455 Commits