ImageApi

Author	SHA1	Message	Date
Cameron Cordes	7cd1ea3cf8	hls: per-library readiness gauges + GET /hls/stats endpoint The hash-keyed pipeline transcodes lazily, so a freshly mounted (or freshly upgraded) library is "mostly pending" for the first hour while the watcher works through the backlog. The operator wants a live read on remaining work so they can tune `HLS_CONCURRENCY` and know when to stop waiting. Adds: - `src/hls_stats.rs` — pure compute path (`stats_from_rows`) and an Arc<Mutex<dyn ExifDao>> wrapper (`compute_and_publish`). Per library: `total`, `with_playlist`, `pending`, `unsupported`, `hashless_videos`. Dedup is by content_hash so duplicate-bytes-at-N-paths counts once (same domain rule as `faces::stats`). `hashless_videos` is a separate counter so the operator can see the "hash backfill, then transcode" pipeline depth instead of having NULL-hash rows just hide. - Prometheus gauges labeled by library name: `imageserver_hls_videos_total`, `..._with_playlist`, `..._pending`, `..._unsupported`. Updated by the watcher at the end of every full-scan tick and on every `/hls/stats` hit, so whichever surface the operator is watching stays fresh. Registered in `main` alongside the existing image/video gauges. - `GET /hls/stats` — Claims-protected JSON snapshot of the same data plus a top-level cross-library aggregate. Runs on a blocking pool so it doesn't pin the actix worker; per-call cost is one `list_paths_and_hashes_for_library` SQL query per library plus a `stat()` per distinct video hash. Bounded — never invoked from middleware, only from the explicit endpoint and the full-scan tick. The watcher's end-of-tick `info!` summary line mirrors the endpoint output for operators tailing the log. - New `ExifDao::list_paths_and_hashes_for_library` method: `SELECT rel_path, content_hash FROM image_exif WHERE library_id = ?`. Single round-trip; callers filter to video extensions client-side because the schema doesn't carry media-type. Mock impl in `files.rs` returns an empty vec.
Tests in `hls_stats::tests` exercise stats_from_rows directly (videos-only filter, hash dedup, playlist vs sentinel decision, NULL-hash hashless counting) plus a publish_gauges round-trip that reads the gauge value back. Full suite (347 lib + 360 bin = 707) passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 15:58:46 -04:00
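The counting rule described in this commit (videos only, dedup by content_hash, NULL-hash rows tallied separately) could look roughly like the sketch below. Only the name `stats_from_rows` and the five counter names come from the commit message; the signature, the `HlsState` enum, the `state_of` callback, the extension list, and the choice to leave hashless videos out of `total` are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical derivative state for one content hash.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HlsState {
    Playlist,
    Unsupported,
    Pending,
}

#[derive(Debug, Default, PartialEq)]
pub struct HlsStats {
    pub total: usize,
    pub with_playlist: usize,
    pub pending: usize,
    pub unsupported: usize,
    pub hashless_videos: usize,
}

/// Assumed extension list; the crate's real `is_video` may differ.
fn is_video(rel_path: &str) -> bool {
    let ext = rel_path.rsplit('.').next().unwrap_or("").to_ascii_lowercase();
    matches!(ext.as_str(), "mp4" | "mov" | "mkv" | "avi" | "webm")
}

/// Counts each distinct video hash once; rows without a hash only bump
/// `hashless_videos` (assumption: they are not part of `total`).
pub fn stats_from_rows<F>(rows: &[(String, Option<String>)], state_of: F) -> HlsStats
where
    F: Fn(&str) -> HlsState,
{
    let mut stats = HlsStats::default();
    let mut seen: HashSet<&str> = HashSet::new();
    for (rel_path, hash) in rows {
        if !is_video(rel_path) {
            continue;
        }
        match hash.as_deref() {
            None => stats.hashless_videos += 1,
            Some(h) => {
                // Same bytes at a second path: already counted.
                if !seen.insert(h) {
                    continue;
                }
                stats.total += 1;
                match state_of(h) {
                    HlsState::Playlist => stats.with_playlist += 1,
                    HlsState::Unsupported => stats.unsupported += 1,
                    HlsState::Pending => stats.pending += 1,
                }
            }
        }
    }
    stats
}
```

In this sketch the filesystem `stat()` per hash is hidden behind `state_of`, which keeps the compute path pure and easy to test.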
Cameron Cordes	b8e17e05b7	hls: rewrite orphan cleanup for hash-keyed layout The cleanup walk previously looked for `$VIDEO_PATH/<basename>.m3u8` and matched each file's stem against a recursive walk of every library. With the hash-keyed layout now in place, every playlist's file_stem is the literal string "playlist" — the old logic would treat every hash-keyed playlist as orphaned on its next run and wipe them all in one tick (default cleanup interval is 24h, so this is a 24-hour bomb on top of the prior commit). New approach: orphan-ness is decided in the database, not on the filesystem. The cleanup loop: - Snapshots every distinct non-NULL `image_exif.content_hash` into a HashSet (new `ExifDao::list_distinct_content_hashes` method — `SELECT DISTINCT content_hash WHERE content_hash IS NOT NULL`). - Walks `$VIDEO_PATH` two levels deep: top-level entries are filtered to 2-char lowercase hex shard dirs, each shard's children to 64-char hex hash dirs. Anything else (legacy `.m3u8` at root from the pre-content-hash era, operator-stashed dirs, partial writes) is left alone. - Hash dirs whose hash isn't in the alive set are `remove_dir_all`'d. Shard dirs that emptied as a result are reaped on the same pass via `remove_dir` (no-op if non-empty). - The library-stale safety gate is preserved: a stale library skips the cycle even though the orphan decision is DB-only, because the upstream missing-file scan that retires `image_exif` rows itself pauses for stale libraries. Belt-and-suspenders — keeping a hash dir for one extra 24h cycle is cheaper than wiping one whose source was briefly unreachable. The gate now also filters disabled libraries out of the stale set (they're intentionally absent from the health map). - The legacy `excluded_dirs` parameter is preserved on the function signature but unused (the walk no longer crosses library trees); flagged with a leading underscore. Callers in `main.rs` stay unchanged. 
`MockExifDao` in `files.rs` grows the new method (returns empty); unit tests for the new `is_hash_shard` / `is_full_hash` validators guard against an operator's stashed directory under VIDEO_PATH ever matching the orphan-rm path. Both pass. A follow-up commit handles the one-shot startup migration that retires the legacy basename-keyed `.m3u8` / `.ts` files at `$VIDEO_PATH` root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 15:41:04 -04:00
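The two validators named in this commit guard the destructive path, so a minimal sketch may help. The function names come from the commit message; the bodies are an assumption, including the reading that the 64-character hash directory name is lowercase hex like the shard name.

```rust
/// True when every byte is a lowercase hex digit (and the string is non-empty).
fn is_lower_hex(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// A shard directory is exactly two lowercase hex characters, e.g. "7c".
pub fn is_hash_shard(name: &str) -> bool {
    name.len() == 2 && is_lower_hex(name)
}

/// A hash directory is exactly 64 hex characters (assumed lowercase here).
pub fn is_full_hash(name: &str) -> bool {
    name.len() == 64 && is_lower_hex(name)
}
```

Under these rules an operator-stashed directory such as `backup` or `AB` never qualifies, so the walk would skip it rather than consider it for removal.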
Cameron Cordes	bec9857426	Split main.rs: extract backfill drains and thumbnails into modules main.rs drops from 3542 → ~2930 lines by moving: - src/backfill.rs (new): backfill_unhashed_backlog, backfill_missing_date_taken, backfill_missing_content_hashes, build_face_candidates, process_face_backlog. Now unit-tested for the first time — 5 tests covering cap behavior, library-id filtering, missing-on-disk skip, and the video/unhashed/scanned filters on face-candidate selection. - src/thumbnails.rs (new): unsupported_thumbnail_sentinel, generate_image_thumbnail, create_thumbnails, update_media_counts, is_image, is_video, plus the IMAGE_GAUGE / VIDEO_GAUGE Prometheus metrics. Replaces the no-op stubs that used to live in lib.rs. 4 new unit tests for the sentinel path math and the walker-counts-images-vs-videos smoke path. Supporting: - SqliteExifDao::from_shared (test-only) so an SqliteExifDao and SqliteFaceDao can share one in-memory connection — required to test build_face_candidates against the real join. - files.rs / video/{mod,actors}.rs import from crate::thumbnails::* instead of the now-removed stubs in lib.rs. cargo test --bin image-api: 325 passing (was 314). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 12:22:02 -04:00
Cameron Cordes	b42acbb3f3	fmt: cargo fmt sweep across drifted files No behavior change — purely whitespace/line-break cleanup that had accumulated since the last format run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:42:41 -04:00
Cameron Cordes	832b50d587	image_exif: manual date_taken override (set/clear endpoints) Add `POST /image/exif/date` and `POST /image/exif/date/clear` so an operator can correct a row whose canonical-date waterfall landed on the wrong value (camera clock reset, fs_time fallback for a copied-from-backup file, etc). New `original_date_taken` / `original_date_taken_source` columns snapshot the prior value on first override so revert is lossless. The waterfall source set is now `'exif' | 'exiftool' | 'filename' | 'fs_time' | 'manual'`. The existing `idx_image_exif_date_backfill` partial index already filters to `date_taken IS NULL OR date_taken_source = 'fs_time'`, so manual rows are naturally excluded from the per-tick drain — no index change needed. `ExifMetadata` now exposes `date_taken_source` + originals so a UI can render "manually set; was X via filename". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 19:26:43 -04:00
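A rough sketch of the snapshot-on-first-override behavior described above. The four column names are from the commit message; the `DateFields` struct, the `set_manual` / `clear_manual` helpers, and the use of Unix seconds for `date_taken` are hypothetical stand-ins for whatever the handlers actually do.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct DateFields {
    pub date_taken: Option<i64>, // assumed: Unix seconds
    pub date_taken_source: Option<String>,
    pub original_date_taken: Option<i64>,
    pub original_date_taken_source: Option<String>,
}

/// Applies a manual date. The prior value is snapshotted only when the row
/// is not already manual, so repeated overrides keep the first original.
pub fn set_manual(row: &mut DateFields, ts: i64) {
    let already_manual = row.date_taken_source.as_deref() == Some("manual");
    if !already_manual {
        row.original_date_taken = row.date_taken;
        row.original_date_taken_source = row.date_taken_source.clone();
    }
    row.date_taken = Some(ts);
    row.date_taken_source = Some("manual".to_string());
}

/// Reverts a manual override by restoring the snapshot; no-op otherwise.
pub fn clear_manual(row: &mut DateFields) {
    if row.date_taken_source.as_deref() != Some("manual") {
        return;
    }
    row.date_taken = row.original_date_taken.take();
    row.date_taken_source = row.original_date_taken_source.take();
}
```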
Cameron Cordes	7f12890f4b	memories: single-SQL rewrite + 20-year lookback Replaces the EXIF-loop + WalkDir-fallback pipeline that powered `/memories` with a single per-library SQL query (`get_memories_in_window`) that uses `strftime('%m-%d' | '%W' | '%m', date_taken, 'unixepoch', tz_offset)` for calendar matching in the client's timezone, plus a `years_back` lower bound and a no-future-dates upper bound. Returns only the matching rows; the handler applies per-library `PathExcluder` post-query and sorts. Drops: - `collect_exif_memories` — replaced by the single SQL query. - `collect_filesystem_memories` — the canonical-date pipeline now populates `date_taken` for every row at ingest, so the WalkDir fallback that scanned 14k+ files each request is no longer needed. - `get_memory_date_with_priority` and friends — request-time waterfall superseded by `date_resolver` running at ingest. The associated three priority-tests are dropped; their replacement lives in `date_resolver::tests`. On a ~14k-file library this drops `/memories` from 10–15 s (dominated by `fs::metadata` per row) to single-digit ms. Bumps `DEFAULT_YEARS_BACK` from 15 → 20 to surface deeper archives on matching anniversaries. Note vs. ISO weeks: the original Rust used `chrono::iso_week().week()` for week-span matching. SQLite's `%W` is Monday-anchored but uses week 0 for days before the first Monday, so it can disagree with ISO at year boundaries by ±1. Acceptable for nostalgia browsing. Adds 3 new DAO tests covering month-span filter, library scoping, and the unknown-span-token guard. Also adds a CLAUDE.md section describing the canonical-date pipeline end-to-end and the new `DATE_BACKFILL_MAX_PER_TICK` env var. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 16:04:09 -04:00
Cameron Cordes	54e0635a98	date_backfill: per-tick drain for unresolved date_taken rows Adds two ExifDao methods (`get_rows_needing_date_backfill` / `backfill_date_taken`) and a `backfill_missing_date_taken` watcher pass that runs on every tick alongside `backfill_unhashed_backlog`. The drain queries the partial index for rows where `date_taken IS NULL` or `date_taken_source = 'fs_time'`, batches up to `DATE_BACKFILL_MAX_PER_TICK` paths (default 500), and feeds them through `date_resolver::resolve_dates_batch` — a single exiftool subprocess covers the whole tick. Rows that newly resolve to `exiftool` / `filename` / `fs_time` get persisted via `backfill_date_taken` (touches only `date_taken` + `date_taken_source` so EXIF / hash / perceptual columns survive). `filename`-sourced rows are intentionally not re-resolved — the regex is authoritative when it matches and re-running exiftool wouldn't change the answer. Files that have disappeared from disk are skipped so a ghost row doesn't loop through the drain forever; the missing-file scan in `library_maintenance` retires those separately. Comes with two DAO unit tests (eligibility filter + column-isolation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 16:03:03 -04:00
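The eligibility rule and per-tick cap from this commit can be sketched as a pure selection step. The `Row` struct and both helper names are hypothetical; in the commit the filter runs in SQL against the partial index, and the cap comes from `DATE_BACKFILL_MAX_PER_TICK`.

```rust
pub struct Row {
    pub rel_path: String,
    pub date_taken: Option<i64>,
    pub date_taken_source: Option<String>,
}

/// Mirrors the predicate `date_taken IS NULL OR date_taken_source = 'fs_time'`.
pub fn needs_date_backfill(row: &Row) -> bool {
    row.date_taken.is_none() || row.date_taken_source.as_deref() == Some("fs_time")
}

/// Picks at most `max_per_tick` eligible paths for one resolver batch.
pub fn select_backfill_batch(rows: &[Row], max_per_tick: usize) -> Vec<&str> {
    rows.iter()
        .filter(|r| needs_date_backfill(r))
        .take(max_per_tick)
        .map(|r| r.rel_path.as_str())
        .collect()
}
```

Rows sourced from `filename`, `exif`, `exiftool`, or `manual` fall outside the predicate, which matches the commit's note that filename-sourced rows are not re-resolved.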
Cameron Cordes	84326501a9	image_exif: add date_taken_source column New nullable TEXT column tracks which step of the canonical-date waterfall (kamadak-exif → exiftool → filename → fs_time) populated `date_taken`. Lets a later per-tick drain re-resolve weak sources (`fs_time`) once stronger ones become available, and gives the UI/debug surface a way to answer "why does this photo show up under this date?". Adds the column at all `InsertImageExif` construction sites with `None` placeholders (the resolver wiring lands in a follow-up commit), and extends the `update_exif` SET tuple so the column survives the GPS-write re-read path. Partial index `idx_image_exif_date_backfill` is created for the upcoming drain query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 15:57:49 -04:00
Cameron Cordes	67cf0c7f73	duplicates: folder-pair view of exact dups Bucket exact-dup rows by (library_id, dirname) pair on each side, then filter by coverage = shared / min(folder_a_total, folder_b_total) and an absolute floor on shared count. Surfaces "this folder is mostly contained in that folder" matches that the per-file EXACT view buries under one row each — e.g. an old phone-backup tree shadowing the organized library, or a topic-grouped folder duplicating a date-grouped one within the same library. New endpoint: GET /duplicates/folder-pairs?library=&include_resolved=&min_coverage=&min_shared=. Cached 5 min keyed on (library, include_resolved); the user-tunable thresholds filter the cached unfiltered pair list so slider drags don't re-bucket. Shares the resolve / unresolve flow with the existing tabs — the frontend fans out N parallel /resolve calls, one per shared content_hash. Folder names carry no signal (BMW lives under Night Photos, not BMW_backup), so bucketing is purely on (library_id, dirname) co-occurrence in exact-dup groups. Within-folder dups (same hash twice in the same folder) are skipped — those belong to the EXACT tab. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 12:43:29 -04:00
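The coverage formula in this commit, coverage = shared / min(folder_a_total, folder_b_total) with an absolute floor on the shared count, as a small worked sketch. The function names are hypothetical, and returning 0.0 for an empty folder is an assumption the commit does not address.

```rust
/// coverage = shared / min(folder_a_total, folder_b_total)
pub fn coverage(shared: usize, folder_a_total: usize, folder_b_total: usize) -> f64 {
    let denom = folder_a_total.min(folder_b_total);
    if denom == 0 {
        0.0 // assumption: an empty folder never forms a pair
    } else {
        shared as f64 / denom as f64
    }
}

/// Applies both user-tunable thresholds from the endpoint's query string.
pub fn passes_thresholds(
    shared: usize,
    folder_a_total: usize,
    folder_b_total: usize,
    min_coverage: f64,
    min_shared: usize,
) -> bool {
    shared >= min_shared && coverage(shared, folder_a_total, folder_b_total) >= min_coverage
}
```

For example, a 10-file phone-backup folder sharing 8 files with a 100-file organized folder has coverage 8 / 10 = 0.8, because the smaller folder sets the denominator.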
Cameron Cordes	7584cd8792	duplicates: perceptual hash + soft-mark resolution + upload 409 Adds pHash + dHash columns alongside the existing blake3 content_hash so near-duplicates (re-encoded, resized, format-converted copies) become queryable. /duplicates/{exact,perceptual} return groups; /duplicates/{resolve,unresolve} flip a duplicate_of_hash soft-mark on losing rows and union perceptual-only tag sets onto the survivor. The default /photos listing filters duplicate_of_hash IS NULL so demoted siblings stop cluttering the grid; include_duplicates=true opts back in for Apollo's review modal. Upload now hashes bytes pre-write and returns 409 with the canonical sibling when a file's bytes already exist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 17:36:01 -04:00
Cameron Cordes	263e27e108	multi-library: handoff + orphan GC with two-tick consensus Branch C of the multi-library data-model rollout. Implements the operational maintenance pipeline pinned in CLAUDE.md → "Multi-library data model" / "Library availability and safety". Branches A and B land first; this branch builds on top. New module: src/library_maintenance.rs Three idempotent passes the watcher runs every tick after the per-library ingest loop: 1. Missing-file scan (per online library) For each Online library, load a paginated page of image_exif rows (IMAGE_EXIF_MISSING_SCAN_PAGE_SIZE, default 500), stat() each one, and delete rows whose source file is NotFound. Permission/IO errors are skipped, never deleted. Capped at IMAGE_EXIF_MISSING_DELETE_CAP_PER_TICK (default 200) per library per tick — so a pathological mount that returns NotFound for everything can't wipe the table in one cycle. Cursor advances across ticks, wraps on partial-page returns, and naturally cycles through the entire library over many minutes. Skipped wholesale for Stale libraries via the existing probe gate. 2. Back-ref refresh (DB-only) For face_detections / tagged_photo / photo_insights: any hash-keyed row whose (library_id, rel_path) no longer matches an image_exif row, but whose content_hash does, is repointed at a surviving image_exif location. Pure SQL with EXISTS guards so rows whose hash is fully orphaned are left alone (the orphan GC handles those). Idempotent; no availability gate needed. This is what makes a recent → archive move invisible to readers: when pass 1 retires the lib-A row, pass 2 pivots tags / faces / insights to lib-B's surviving path before any client notices. 3. Orphan GC (destructive) Hash-keyed derived rows whose content_hash has no image_exif referent are GC-eligible. Two-tick consensus: a hash must be observed orphaned on two consecutive ticks AND every library must be Online for both. 
A single Stale tick within the window cancels all pending deletes (they remain marked but won't be promoted) — they're re-evaluated next tick. The pending set lives in OrphanGcState (in-memory); a watcher restart resets it, which can only delay a delete, never cause one. Hashes that re-appear in image_exif between ticks are "revived" from the pending set (handles transient share unmount / remount). Two new ExifDao methods: - list_rel_paths_for_library_page(library_id, limit, offset) for the paginated missing-file scan. - (count_for_library landed in Branch A.) Watcher wiring (main.rs) Per-library: missing-file scan inside the existing per-library loop, after process_new_files, gated by the same probe check that already protects ingest. After the loop: reconcile (Branch B), back-ref refresh, then run_orphan_gc. The maintenance connection is opened once per tick (image_api::database::connect), used by all three DB-only passes, and dropped at end of tick. CLAUDE.md gains a "Maintenance pipeline" subsection that describes the three passes and their interaction with the existing availability-and-safety policy. Tests: 225 pass (217 from Branch B + 8 new in library_maintenance covering back-ref refresh including the fully-orphaned no-op case, two-tick GC consensus, Stale-tick consensus reset, image_exif re-appearance revival, multi-table delete, and the all_libraries_online helper). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:27:53 +00:00
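A minimal sketch of the two-tick consensus for the orphan GC pass. Only the `OrphanGcState` name comes from the commit; the `tick` method, its arguments, and the decision to clear the pending set on a Stale tick are assumptions. Clearing is a conservative reading of "cancels all pending deletes": it can delay a delete but never cause one.

```rust
use std::collections::HashSet;

#[derive(Default)]
pub struct OrphanGcState {
    /// Hashes seen orphaned on the previous all-Online tick.
    pending: HashSet<String>,
}

impl OrphanGcState {
    /// Returns the hashes that are safe to delete this tick, sorted.
    pub fn tick(
        &mut self,
        orphaned_now: &HashSet<String>,
        all_libraries_online: bool,
    ) -> Vec<String> {
        if !all_libraries_online {
            // Assumption: a Stale tick resets the consensus window.
            self.pending.clear();
            return Vec::new();
        }
        // Orphaned now AND on the previous tick: promote to delete.
        // Orphaned only now: becomes pending for the next tick.
        let (mut to_delete, newly_seen): (Vec<String>, Vec<String>) = orphaned_now
            .iter()
            .cloned()
            .partition(|h| self.pending.contains(h));
        to_delete.sort();
        // Hashes that re-appeared in image_exif are absent from
        // `orphaned_now`, so they drop out here ("revived").
        self.pending = newly_seen.into_iter().collect();
        to_delete
    }
}
```

Because the state is in memory only, a fresh `OrphanGcState::default()` after a restart starts with an empty pending set, which is the delay-only behavior the commit describes.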
Cameron Cordes	eea1bf3181	multi-library: availability probe + scoped EXIF queries + collision fixes Branch A of the multi-library data-model rollout. Three threads of correctness/safety work that ship together because the new mount needs all three before it can land: 1. Library availability probe (libraries.rs, state.rs, main.rs) New LibraryHealth (Online | Stale { reason, since }) and a shared LibraryHealthMap on AppState. Probe checks root_path exists + is_dir + readable + non-empty (relative to a "had_data" signal so fresh mounts aren't downgraded). The watcher tick begins with a refresh_health() per library; stale libraries skip ingest, the hash backfill, and face-detection backlog drains for that tick. The orphaned-playlist cleanup also gates on every library being online — a missing source on a stale library is indistinguishable from a transient unmount, and the cleanup is destructive. /libraries now returns each library with its current health state. Logs only on Online↔Stale transitions so a long outage doesn't spam. New ExifDao::count_for_library is the "had_data" signal. 2. EXIF queries scoped by library_id (database/mod.rs, files.rs, main.rs, tags.rs) query_by_exif gains an Option<i32> library filter; /photos and /photos/exif now pass it. Without this, an EXIF-filtered request scoped to ?library=N returned cross-library results because the handler resolved the library but didn't push it through to SQL. get_exif_batch gains the same option. The watcher's per-library ingest, face-candidate build, and content-hash backfill all scope to their library; the union-mode /photos date-sort path and the library-agnostic tag fan-out (lookup_tags_batch, by design) keep using None. 3. Derivative-path collision fixes (content_hash.rs, main.rs) New content_hash::library_scoped_legacy_path helper: <derivative_dir>/<library_id>/<rel_path>.
Thumbnail generation (startup walk + watcher needs-thumb check) and serving now use it; serving falls back to the bare-legacy mirrored path so pre-multi-library deployments keep working without regeneration. Without this, lib2 with the same rel_path as lib1 would have its thumbnail request short-circuit to lib1's image. Orphaned-playlist cleanup walks every library when checking for the source video (was: BASE_PATH only). Without this, mounting a 2nd library and waiting 24h would delete every playlist whose source lived only in the 2nd library. The HLS playlist write path collision (filename-only basename, not rel_path) is left as a known issue with a TODO at the call site — the actor-pipeline rewrite belongs in Branch B/C. Tests: 212 pass (cargo test --lib). New tests cover the probe states (online / missing root / non-dir / empty-with-prior-data), refresh_health transitions, query_by_exif scoping, get_exif_batch keying on (library_id, rel_path), library_scoped_legacy_path, and count_for_library. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:12:49 +00:00
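The path layout `<derivative_dir>/<library_id>/<rel_path>` can be sketched as follows. The helper name comes from the commit; the signature, and the stripping of a leading slash so that `join` cannot discard the prefix, are assumptions.

```rust
use std::path::{Path, PathBuf};

/// Builds <derivative_dir>/<library_id>/<rel_path>.
pub fn library_scoped_legacy_path(
    derivative_dir: &Path,
    library_id: i32,
    rel_path: &str,
) -> PathBuf {
    // `Path::join` replaces the base when given an absolute path, so a
    // stray leading '/' is trimmed first (assumption, not from the commit).
    let rel = rel_path.trim_start_matches('/');
    derivative_dir.join(library_id.to_string()).join(rel)
}
```

With this layout, two libraries that both contain `trips/a.jpg` resolve to different derivative files, which is the collision the commit fixes.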
Cameron Cordes	f50655fb21	indexer: apply EXCLUDED_DIRS to remaining WalkDir callers Audit follow-up to `5bf4956`. The same `@eaDir` pruning that protects the indexer also needs to protect the other walks under library roots: - `create_thumbnails` walks every file in every library to generate thumbnails. Without EXCLUDED_DIRS, it would generate thumbnails of Synology's `SYNOFILE_THUMB_*.jpg` thumbnails (thumbnails of thumbnails). - `update_media_counts` walks for the prometheus IMAGE / VIDEO gauges. Without EXCLUDED_DIRS, the gauges over-count by however many phantom `@eaDir` images live alongside the real photos. - `cleanup_orphaned_playlists` walks BASE_PATH searching for source videos by filename. EXCLUDED_DIRS isn't a behavior change for typical Synology mounts (no .mp4 in @eaDir), but it's a correctness win for any operator-defined exclude that happens to contain video. Refactor: add `walk_library_files(base, excluded_dirs) -> Vec<DirEntry>` to file_scan.rs as the shared primitive. `enumerate_indexable_files` now layers media-type + mtime filters on top of it. One new test covers the lower-level helper (returns all extensions, prunes excluded subtrees). `generate_video_gifs` (currently `#[allow(dead_code)]`, not reachable from main) gets the `update_media_counts` signature update and reads EXCLUDED_DIRS from env so a future revival isn't broken — but its WalkDir walk stays raw because the dual lib/bin compile makes the file_scan module path non-trivial there. Tagged with a comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:21:17 +00:00
Cameron Cordes	6a6a4a6a46	tags: batch lookup expands content-hash siblings cross-library The first cut matched by rel_path only — fine for single-library deploys but wrong for multi-library setups where the same content lives under different rel_paths (e.g. a backup mount holding copies of the primary library). A tag applied under library A would silently not appear in the library-B grid badge even though the carousel's per-path /image/tags would resolve it correctly via siblings. The batch handler now does the expansion server-side in three queries regardless of input size: 1. image_exif batch lookup → query path → content_hash 2. image_exif JOIN by content_hash → all sibling rel_paths sharing each hash (paths are deduped across libraries) 3. tagged_photo + tags JOIN over the union of (query + sibling) rel_paths Tags are then aggregated back to query paths via a sibling→originals reverse map, deduped by tag id. Files without a content_hash (just indexed, hash compute pending, etc.) skip step 2 and only get tags from their own rel_path — same fallback the per-path handler uses. Adds ExifDao::get_rel_paths_for_hashes (batch counterpart of get_rel_paths_by_hash) chunked at 500 to stay under SQLite's SQLITE_LIMIT_VARIABLE_NUMBER. Five queries for a 4k-photo grid is still ~800x cheaper than per-path HTTP fan-out.	2026-04-30 00:36:44 +00:00
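The chunking described for `ExifDao::get_rel_paths_for_hashes` could be sketched like this. The `lookup_in_chunks` helper and the `run_query` callback are hypothetical; the callback stands in for one SQL round-trip with at most 500 bound parameters.

```rust
/// Chunk size from the commit message.
pub const MAX_HASHES_PER_QUERY: usize = 500;

/// Runs `run_query` once per chunk and concatenates the
/// (content_hash, rel_path) rows it returns.
pub fn lookup_in_chunks<F>(hashes: &[String], mut run_query: F) -> Vec<(String, String)>
where
    F: FnMut(&[String]) -> Vec<(String, String)>,
{
    let mut rows = Vec::new();
    for chunk in hashes.chunks(MAX_HASHES_PER_QUERY) {
        rows.extend(run_query(chunk));
    }
    rows
}
```

In this sketch, 1200 hashes produce three calls of sizes 500, 500, and 200, so the query count grows with input size only in steps of 500.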
Cameron Cordes	7621282419	Thumb orientation + library filter on /photos/exif Two follow-ups on the same feature branch: 1. Bake EXIF orientation into generated thumbnails. The `image` crate doesn't apply Orientation on load, and `save_with_format(..Jpeg)` drops EXIF — so portrait phone shots ended up sideways in any client that displays the cached thumb directly (no EXIF tag for the browser to compensate from). New `exif::read_orientation` reads the tag cheaply (no full EXIF parse) and `exif::apply_orientation` does the rotate/flip via image's existing `rotate90/180/270` + `fliph/flipv`. Applied in both branches of `generate_image_thumbnail` (RAW embedded-JPEG path and the regular `image::open` path). Existing thumbnails in the cache are still wrong-orientation; wipe the thumb dir or run a one-off backfill once this lands. 2. Optional `library` query param on `/photos/exif`. Accepts numeric id or name (same shape as `/image?library=...`), resolved via the existing `resolve_library_param` helper so a bad value 400s before we touch the DAO. Filter is applied post-query in the handler rather than pushed into `query_by_exif` to keep the DAO trait (and its test mocks) unchanged. Cheap enough at typical library counts; can be moved into SQL later if it ever isn't. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 17:29:36 -04:00
Cameron Cordes	c6f82ebaba	Batch EXIF endpoint: GET /photos/exif Adds a single round-trip projection of `image_exif` for every photo whose `date_taken` falls in `[date_from, date_to]`. Wraps the existing `ExifDao::query_by_exif` DAO method which already handles the SQL filter in one query against the covering index — the only missing piece was HTTP plumbing. Designed for window-scoped consumers like Apollo's photo-to-track matcher, which currently does N+1 (one `/photos` listing + one `/image/metadata` per photo). Because `/image/metadata` serializes on `Data<Mutex<dyn ExifDao>>`, that pattern can take 10s+ for windows with hundreds of photos. The new endpoint takes one mutex acquisition for the whole batch. Response shape: { photos: [ { file_path, library_id, library_name, camera_model, width, height, gps_latitude, gps_longitude, date_taken } ], total: N } Two notes on scope: - Photos with NULL `date_taken` are excluded by `query_by_exif`'s semantics. Filename-extracted dates are not synthesized here; rare callers that need that fallback can still hit `/image/metadata`. - GPS columns are stored as f32 in image_exif to keep row size small; the JSON shape widens to f64 so clients don't have to know about the on-disk precision. Library names are pre-mapped from `app_state.libraries` once and stamped on each row, avoiding an O(rows × libraries) linear scan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 16:38:53 -04:00
Cameron	dc2a96162e	fix(dates): prefer earliest of fs created/modified as fallback On copied or restored files (e.g. a backup library), the OS stamps created at copy time while modified is preserved from the source, so the earlier of the two is a better proxy for when the content originated. Adds utils::earliest_fs_time and threads it through the three spots that fall back to filesystem dates: photos-list sort, memories grouping, and insight-generation timestamp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 17:20:12 -04:00
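A sketch of the "earliest of created/modified" fallback. The name `earliest_fs_time` is from the commit, but its real signature is not shown there; the `&Metadata` parameter and the `earliest_of` helper are assumptions.

```rust
use std::fs::Metadata;
use std::time::SystemTime;

/// Picks the earlier timestamp, tolerating platforms where one is missing.
pub fn earliest_of(
    created: Option<SystemTime>,
    modified: Option<SystemTime>,
) -> Option<SystemTime> {
    match (created, modified) {
        (Some(c), Some(m)) => Some(c.min(m)),
        (c, m) => c.or(m),
    }
}

/// Hypothetical signature: reads both times from filesystem metadata.
/// `created()` is unsupported on some filesystems, hence the `.ok()`.
pub fn earliest_fs_time(meta: &Metadata) -> Option<SystemTime> {
    earliest_of(meta.created().ok(), meta.modified().ok())
}
```

For a restored file whose created time is the copy date and whose modified time was preserved from the source, this returns the modified time.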
Cameron	3027a3ffda	perf: DB-backed recursive /photos + watcher reconciliation Recursive listings now query image_exif instead of walking disk, taking union-mode /photos from ~17s to sub-second on a 10k-file library. The watcher's full scan prunes stale image_exif rows so the DB stays in parity with the filesystem when files are deleted externally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	b04dd8b601	fix: demote path-not-exists validation errors to debug The /image cross-library fallback tries the resolved library first and falls back to any library holding the rel_path. The first attempt emitted error-level noise on every grid tile in union mode. Split the validation error so only traversal attempts log at error; missing-file cases log at debug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	2c8de8dcc6	feat: union /photos and /memories across libraries When `library` is omitted, both endpoints now walk every configured library root, interleave the results, and tag each row with its source library via the parallel `photo_libraries` / per-row `library_id` arrays. Previously the handlers fell back to the primary library, silently hiding the rest. Threads a parallel `file_libraries: Vec<i32>` through the sort/paginate helpers so library attribution survives sorting and pagination. Directory names are de-duplicated across libraries. `get_all_with_date_taken` grows an optional library filter so memories can scope its EXIF query per-library during the union walk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	586b735af5	feat: include per-photo library id in /photos response Adds a parallel `photo_libraries: Vec<i32>` array alongside `photos` in `PhotosResponse` so clients can render per-thumbnail badges. Populated with the scoped library id at the two main return sites; left empty for `/favorites` since favorites are library-agnostic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	c2ee3996be	chore: apply cargo fmt + clippy cleanup across crate Silence forward-looking dead_code on unused DAO modules, annotate individual placeholder items, rewrite tautological assert!(true/false) in token tests as panic! arms, and pick up fmt drift. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	a0f3bfab5f	fix: validate gps-summary path against every library The /photos/gps-summary handler validated the incoming path against the primary library's root with new_file=false, which requires the path to exist on disk. For a viewer opened on a file from a non-primary library, tapping the GPS link produced activePath = <folder from lib 2>, the primary-only check failed, and the server 400'd — so the map came up empty. Validation here is purely a traversal guard (the DAO does a prefix LIKE against rel_path), so we now accept the path as long as any configured library can resolve it without escaping its root. Also applies cargo fmt drift on files touched this session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	7becbc0737	fix: normalize rel_path separators in non-recursive /photos listing On Windows, strip_prefix preserves backslashes, so the non-recursive branch was looking up tags for 'Melissa\img1.jpg' while tagged_photo stores 'Melissa/img1.jpg' — every file was filtered out. Normalize to '/' to match the watcher and populate_knowledge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
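The separator fix amounts to a one-line normalization before the tag lookup. The helper name below is hypothetical; the example path is the one from the commit message.

```rust
/// Converts Windows-style separators to '/' so the key matches what
/// tagged_photo stores.
pub fn normalize_rel_path(rel_path: &str) -> String {
    rel_path.replace('\\', "/")
}
```

With this, `Melissa\img1.jpg` becomes `Melissa/img1.jpg`, and paths that already use forward slashes pass through unchanged.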
Cameron	2d942a9926	feat: content-hash-aware tag/insight sharing + library scoping Tags and insights now follow content across libraries via content_hash lookups on the read path, so the same file indexed at different rel_paths in multiple libraries shares its annotations. Recursive tag search scopes hits to the selected library by checking each tagged rel_path against the library's disk (with a content-hash sibling fallback so tags attached under one library's rel_path still match a content-equivalent file in another). The /image and /image/metadata handlers fall back across libraries when the file isn't under the resolved one, so union-mode search results (which carry no library attribution in the response) still serve correctly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	c01a0479b7	fix: honor library param in /image, /photos, /memories The Phase 3 plumbing accepted `library=` but didn't actually route requests through the scoped library once it was resolved. Three concrete bugs surfaced when testing against a second mounted library: - `/image` always resolved paths against AppState.base_path (primary), so thumbnails for non-primary libraries 400'd when their rel_paths didn't exist under primary. Now resolves against the scoped library and defaults to primary when the param is omitted. - `/memories` walked the scoped library correctly but its helper functions hardcoded `library_id: PRIMARY_LIBRARY_ID` on every MemoryItem, causing clients to route thumbnails back to primary regardless of which library the memory actually came from. - `/photos` non-recursive listing delegated to a `RealFileSystem` constructed from AppState.base_path at startup, so walks always hit primary even when `library=2` was passed. The non-primary path now uses list_files against the scoped library's root; primary still goes through FileSystemAccess to preserve the existing test mock plumbing. Also adds `library` to ThumbnailRequest so the /image query param is actually parsed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	0aaea91cc2	feat: add content_hash backfill + register every media file Adds blake3 content hashing as the basis for derivative dedup (thumbnails, HLS) across libraries. Computed inline by the watcher on ingest and by a new `backfill_hashes` binary for historical rows. Key changes: - `content_hash` and `size_bytes` are now populated on new image_exif rows; a new ExifDao surface (`get_rows_missing_hash`, `backfill_content_hash`, `find_by_content_hash`) supports backfill and future hash-keyed lookups. - The watcher now registers every image/video in image_exif, not just files with parseable EXIF. EXIF becomes optional enrichment; videos and other non-EXIF files still get a hashed row. This also makes DB-indexed sort/filter cover the full library. - `/image` thumbnail serve dual-looks up hash-keyed path first, then falls back to the legacy mirrored layout. - Upload flow accepts `?library=` query param + hashes uploaded files. - Store_exif logs the underlying Diesel error on insert failure so constraint violations surface instead of hiding behind a generic InsertError. - New migration normalizes rel_path separators to forward slash across all tables, deduplicating any rows that collide after normalization. Fixes spurious UNIQUE violations from mixed backslash/forward-slash paths on Windows ingest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	ce5b337582	feat: make file watcher, thumbnails, and upload library-aware `watch_files` and `create_thumbnails` now iterate every configured library, tagging rows with the correct `library_id`. `process_new_files` takes a `&Library` so InsertImageExif no longer hardcodes the primary library. Upload accepts an optional `library` query param to pick a target library; omitted still defaults to primary for backwards compatibility. Hash-keyed thumbnail/HLS storage with dual-lookup fallback is deferred to Phase 5, where it's bundled with the content hash backfill that actually makes the hash-keyed paths meaningful. Until hashes are populated, the legacy mirrored layout is a no-op to change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	48e5de6eab	feat: add GET /libraries and library query param plumbing New `/libraries` endpoint returns configured libraries so clients can discover them. `FilesRequest` and `MemoriesRequest` gain an optional `library` param (accepts name or numeric id). Unknown values are rejected with 400; absent values span all libraries. `/memories` now scopes its filesystem walk + EXIF query to the resolved library. `MemoryItem` carries `library_id` so union-mode clients can render a per-item source badge. Behavior is unchanged in single-library mode: omitting `library` still returns results from the primary library, which is the only one configured until a second row is added to the libraries table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	ffcddbb843	feat: multi-library foundation (schema + libraries module) Adds a `libraries` registry table and threads library_id through per-instance metadata tables (image_exif, photo_insights, entity_photo_links, video_preview_clips). File-path columns renamed to rel_path to make the relative-to-root semantics explicit. Adds content_hash + size_bytes on image_exif to support future hash-keyed thumbnail/HLS dedup. Tags and favorites stay library-agnostic so they share across libraries by rel_path. Behavior is unchanged: a single primary library (id=1) is seeded from BASE_PATH on first boot; all handlers and DAOs route through it as a transitional shim until the API gains a library query param. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-21 01:55:07 +00:00
Cameron	da16fddce3	Address path traversal and other security fixes	2026-04-10 14:58:57 -04:00
Cameron	da039bbc49	fix: include files without EXIF when sorting by date Date sorting previously used a DB-level query that acted as an inner join, silently dropping files with no image_exif row. Replace it with the existing in-memory sort which already falls back to filename-extracted and filesystem dates, so all files appear in sorted results. Also removes the now-unused get_files_sorted_by_date trait method and its SqliteExifDao implementation and test mock. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 14:43:26 -04:00
Cameron	c1b6013412	chore: cargo fmt + clippy fix for collapsed if-let chain (T017) - cargo fmt applied across all modified source files - Collapse nested if let Some / if !is_empty into a single let-chain (clippy::collapsible_match) - All other warnings are pre-existing dead-code lint on unused trait methods Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 23:09:58 -04:00
Cameron	8ecd3c6cf8	refactor: use Arc<Mutex<SqliteConnection>> in SqliteTagDao, remove unsafe impl Sync Aligns SqliteTagDao with the pattern used by SqliteExifDao and SqliteInsightDao. The unsafe impl Sync workaround is no longer needed since Arc<Mutex<>> provides safe interior mutability and automatic Sync derivation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 17:10:11 -04:00
Cameron	7a0da1ab4a	Build insight title from generated summary	2026-02-24 16:08:25 -05:00
Cameron	1efdd02eda	Add GPS summary sorting Run cargo fmt/clippy	2026-01-28 10:52:17 -05:00
Cameron	1d2f4e3441	Add circular thumbnail creation for Map view	2026-01-26 20:04:14 -05:00
Cameron	073b5ed418	Added gps-summary endpoint for Map integration	2026-01-26 11:58:24 -05:00
Cameron	be483c9c1a	Add database optimizations for photo search and pagination Implement database-level sorting with composite indexes for efficient date and tag queries. Add pagination metadata support and optimize tag count queries using batch processing.	2026-01-18 19:17:10 -05:00
Cameron	af35a996a3	Cleanup unused message embedding code Fixup some warnings	2026-01-14 13:33:36 -05:00
Cameron	e2d6cd7258	Run clippy fix	2026-01-14 13:17:58 -05:00
Cameron	f65f4efde8	Make date parse from metadata a little more consistent	2026-01-14 12:54:36 -05:00
Cameron	fa600f1c2c	Fallback to sorting by Metadata date	2026-01-11 14:39:50 -05:00
Cameron	d86b2c3746	Add Google Takeout data import infrastructure Implements Phase 1 & 2 of Google Takeout RAG integration: - Database migrations for calendar_events, location_history, search_history - DAO implementations with hybrid time + semantic search - Parsers for .ics, JSON, and HTML Google Takeout formats - Import utilities with batch insert optimization Features: - CalendarEventDao: Hybrid time-range + semantic search for events - LocationHistoryDao: GPS proximity with Haversine distance calculation - SearchHistoryDao: Semantic-first search (queries are embedding-rich) - Batch inserts for performance (1M+ records in minutes vs hours) - OpenTelemetry tracing for all database operations Import utilities: - import_calendar: Parse .ics with optional embedding generation - import_location_history: High-volume GPS data with batch inserts - import_search_history: Always generates embeddings for semantic search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-05 14:50:49 -05:00
Cameron	1171f19845	Create Insight Generation Feature Added integration with Messages API and Ollama	2026-01-03 10:30:37 -05:00
Cameron	54e23a29b3	Fix warnings	2025-12-29 14:29:29 -05:00
Cameron	ccd16ba987	Files endpoint refactoring	2025-12-26 22:20:01 -05:00
Cameron	c035678162	Add tracing to EXIF DAO methods	2025-12-23 22:57:24 -05:00
Cameron	636701a69e	Refactor file type checking for better consistency Fix tests	2025-12-23 22:30:53 -05:00
Cameron	3a64b30621	Fix Date sorting in tagged/recursive search	2025-12-23 22:07:40 -05:00

121 Commits