The first cut matched by rel_path only — fine for single-library
deploys but wrong for multi-library setups where the same content
lives under different rel_paths (e.g. a backup mount holding copies
of the primary library). A tag applied under library A would silently
not appear in the library-B grid badge even though the carousel's
per-path /image/tags would resolve it correctly via siblings.
The batch handler now does the expansion server-side in three queries
regardless of input size:
1. image_exif batch lookup → query path → content_hash
2. image_exif JOIN by content_hash → all sibling rel_paths sharing
each hash (paths are deduped across libraries)
3. tagged_photo + tags JOIN over the union of (query + sibling)
rel_paths
Tags are then aggregated back to query paths via a sibling→originals
reverse map, deduped by tag id. Files without a content_hash (just
indexed, hash compute pending, etc.) skip step 2 and only get tags
from their own rel_path — same fallback the per-path handler uses.
Adds ExifDao::get_rel_paths_for_hashes (batch counterpart of
get_rel_paths_by_hash) chunked at 500 to stay under SQLite's
SQLITE_LIMIT_VARIABLE_NUMBER. Five queries for a 4k-photo grid is
still ~800x cheaper than per-path HTTP fan-out.
Two follow-ups on the same feature branch:
1. Bake EXIF orientation into generated thumbnails. The `image` crate
doesn't apply Orientation on load, and `save_with_format(..Jpeg)`
drops EXIF — so portrait phone shots ended up sideways in any client
that displays the cached thumb directly (no EXIF tag for the browser
to compensate from). New `exif::read_orientation` reads the tag
cheaply (no full EXIF parse) and `exif::apply_orientation` does the
rotate/flip via image's existing `rotate90/180/270` + `fliph/flipv`.
Applied in both branches of `generate_image_thumbnail` (RAW embedded-
JPEG path and the regular `image::open` path). Existing thumbnails
in the cache are still wrong-orientation; wipe the thumb dir or run
a one-off backfill once this lands.
2. Optional `library` query param on `/photos/exif`. Accepts numeric id
or name (same shape as `/image?library=...`), resolved via the
existing `resolve_library_param` helper so a bad value 400s before
we touch the DAO. Filter is applied post-query in the handler
rather than pushed into `query_by_exif` to keep the DAO trait
(and its test mocks) unchanged. Cheap enough at typical library
counts; can be moved into SQL later if it ever isn't.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a single round-trip projection of `image_exif` for every photo whose
`date_taken` falls in `[date_from, date_to]`. Wraps the existing
`ExifDao::query_by_exif` DAO method which already handles the SQL filter
in one query against the covering index — the only missing piece was
HTTP plumbing.
Designed for window-scoped consumers like Apollo's photo-to-track
matcher, which currently does N+1 (one `/photos` listing + one
`/image/metadata` per photo). Because `/image/metadata` serializes on
`Data<Mutex<dyn ExifDao>>`, that pattern can take 10s+ for windows with
hundreds of photos. The new endpoint takes one mutex acquisition for
the whole batch.
Response shape:
{ photos: [
{ file_path, library_id, library_name,
camera_model, width, height,
gps_latitude, gps_longitude, date_taken } ],
total: N }
Two notes on scope:
- Photos with NULL `date_taken` are excluded by `query_by_exif`'s
semantics. Filename-extracted dates are not synthesized here; rare
callers that need that fallback can still hit `/image/metadata`.
- GPS columns are stored as f32 in image_exif to keep row size small;
the JSON shape widens to f64 so clients don't have to know about the
on-disk precision.
Library names are pre-mapped from `app_state.libraries` once and
stamped on each row, avoiding an O(rows × libraries) linear scan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On copied or restored files (e.g. a backup library), the OS stamps
created at copy time while modified is preserved from the source, so
the earlier of the two is a better proxy for when the content
originated. Adds utils::earliest_fs_time and threads it through the
three spots that fall back to filesystem dates: photos-list sort,
memories grouping, and insight-generation timestamp.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Recursive listings now query image_exif instead of walking disk, taking
union-mode /photos from ~17s to sub-second on a 10k-file library. The
watcher's full scan prunes stale image_exif rows so the DB stays in
parity with the filesystem when files are deleted externally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /image cross-library fallback tries the resolved library first and falls
back to any library holding the rel_path. The first attempt emitted error-level
noise on every grid tile in union mode. Split the validation error so only
traversal attempts log at error; missing-file cases log at debug.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When `library` is omitted, both endpoints now walk every configured
library root, interleave the results, and tag each row with its source
library via the parallel `photo_libraries` / per-row `library_id`
arrays. Previously the handlers fell back to the primary library,
silently hiding the rest.
Threads a parallel `file_libraries: Vec<i32>` through the sort/paginate
helpers so library attribution survives sorting and pagination.
Directory names are de-duplicated across libraries.
`get_all_with_date_taken` grows an optional library filter so memories
can scope its EXIF query per-library during the union walk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a parallel `photo_libraries: Vec<i32>` array alongside `photos`
in `PhotosResponse` so clients can render per-thumbnail badges.
Populated with the scoped library id at the two main return sites;
left empty for `/favorites` since favorites are library-agnostic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Silence forward-looking dead_code on unused DAO modules, annotate
individual placeholder items, rewrite tautological assert!(true/false)
in token tests as panic! arms, and pick up fmt drift.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /photos/gps-summary handler validated the incoming path against
the primary library's root with new_file=false, which requires the
path to exist on disk. For a viewer opened on a file from a
non-primary library, tapping the GPS link produced activePath =
<folder from lib 2>, the primary-only check failed, and the server
400'd — so the map came up empty.
Validation here is purely a traversal guard (the DAO does a prefix
LIKE against rel_path), so we now accept the path as long as any
configured library can resolve it without escaping its root.
Also applies cargo fmt drift on files touched this session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows, strip_prefix preserves backslashes, so the non-recursive
branch was looking up tags for 'Melissa\img1.jpg' while tagged_photo
stores 'Melissa/img1.jpg' — every file was filtered out. Normalize to
'/' to match the watcher and populate_knowledge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tags and insights now follow content across libraries via content_hash
lookups on the read path, so the same file indexed at different rel_paths
in multiple libraries shares its annotations. Recursive tag search scopes
hits to the selected library by checking each tagged rel_path against
the library's disk (with a content-hash sibling fallback so tags attached
under one library's rel_path still match a content-equivalent file in
another). The /image and /image/metadata handlers fall back across
libraries when the file isn't under the resolved one, so union-mode
search results (which carry no library attribution in the response)
still serve correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Phase 3 plumbing accepted `library=` but didn't actually route
requests through the scoped library once it was resolved. Three
concrete bugs surfaced when testing against a second mounted library:
- `/image` always resolved paths against AppState.base_path (primary),
so thumbnails for non-primary libraries 400'd when their rel_paths
didn't exist under primary. Now resolves against the scoped library
and defaults to primary when the param is omitted.
- `/memories` walked the scoped library correctly but its helper
functions hardcoded `library_id: PRIMARY_LIBRARY_ID` on every
MemoryItem, causing clients to route thumbnails back to primary
regardless of which library the memory actually came from.
- `/photos` non-recursive listing delegated to a `RealFileSystem`
constructed from AppState.base_path at startup, so walks always
hit primary even when `library=2` was passed. The non-primary
path now uses list_files against the scoped library's root;
primary still goes through FileSystemAccess to preserve the
existing test mock plumbing.
Also adds `library` to ThumbnailRequest so the /image query param
is actually parsed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds blake3 content hashing as the basis for derivative dedup
(thumbnails, HLS) across libraries. Computed inline by the watcher on
ingest and by a new `backfill_hashes` binary for historical rows.
Key changes:
- `content_hash` and `size_bytes` are now populated on new image_exif
rows; a new ExifDao surface (`get_rows_missing_hash`,
`backfill_content_hash`, `find_by_content_hash`) supports backfill and
future hash-keyed lookups.
- The watcher now registers every image/video in image_exif, not just
files with parseable EXIF. EXIF becomes optional enrichment; videos
and other non-EXIF files still get a hashed row. This also makes
DB-indexed sort/filter cover the full library.
- `/image` thumbnail serve dual-looks up hash-keyed path first, then
falls back to the legacy mirrored layout.
- Upload flow accepts `?library=` query param + hashes uploaded files.
- Store_exif logs the underlying Diesel error on insert failure so
constraint violations surface instead of hiding behind a generic
InsertError.
- New migration normalizes rel_path separators to forward slash across
all tables, deduplicating any rows that collide after normalization.
Fixes spurious UNIQUE violations from mixed backslash/forward-slash
paths on Windows ingest.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`watch_files` and `create_thumbnails` now iterate every configured
library, tagging rows with the correct `library_id`. `process_new_files`
takes a `&Library` so InsertImageExif no longer hardcodes the primary
library. Upload accepts an optional `library` query param to pick a
target library; omitted still defaults to primary for backwards
compatibility.
Hash-keyed thumbnail/HLS storage with dual-lookup fallback is deferred
to Phase 5, where it's bundled with the content hash backfill that
actually makes the hash-keyed paths meaningful. Until hashes are
populated, the legacy mirrored layout is a no-op to change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New `/libraries` endpoint returns configured libraries so clients can
discover them. `FilesRequest` and `MemoriesRequest` gain an optional
`library` param (accepts name or numeric id). Unknown values are
rejected with 400; absent values span all libraries. `/memories`
now scopes its filesystem walk + EXIF query to the resolved library.
`MemoryItem` carries `library_id` so union-mode clients can render a
per-item source badge.
Behavior is unchanged in single-library mode: omitting `library` still
returns results from the primary library, which is the only one
configured until a second row is added to the libraries table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a `libraries` registry table and threads library_id through
per-instance metadata tables (image_exif, photo_insights,
entity_photo_links, video_preview_clips). File-path columns renamed to
rel_path to make the relative-to-root semantics explicit. Adds
content_hash + size_bytes on image_exif to support future hash-keyed
thumbnail/HLS dedup. Tags and favorites stay library-agnostic so they
share across libraries by rel_path.
Behavior is unchanged: a single primary library (id=1) is seeded from
BASE_PATH on first boot; all handlers and DAOs route through it as a
transitional shim until the API gains a library query param.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Date sorting previously used a DB-level query that acted as an inner join,
silently dropping files with no image_exif row. Replace it with the existing
in-memory sort which already falls back to filename-extracted and filesystem
dates, so all files appear in sorted results.
Also removes the now-unused get_files_sorted_by_date trait method and its
SqliteExifDao implementation and test mock.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- cargo fmt applied across all modified source files
- Collapse nested if let Some / if !is_empty into a single let-chain (clippy::collapsible_match)
- All other warnings are pre-existing dead-code lint on unused trait methods
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Aligns SqliteTagDao with the pattern used by SqliteExifDao and SqliteInsightDao.
The unsafe impl Sync workaround is no longer needed since Arc<Mutex<>> provides
safe interior mutability and automatic Sync derivation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implement database-level sorting with composite indexes for efficient date and tag queries. Add pagination metadata support and optimize tag count queries using batch processing.
Implements Phase 1 & 2 of Google Takeout RAG integration:
- Database migrations for calendar_events, location_history, search_history
- DAO implementations with hybrid time + semantic search
- Parsers for .ics, JSON, and HTML Google Takeout formats
- Import utilities with batch insert optimization
Features:
- CalendarEventDao: Hybrid time-range + semantic search for events
- LocationHistoryDao: GPS proximity with Haversine distance calculation
- SearchHistoryDao: Semantic-first search (queries are embedding-rich)
- Batch inserts for performance (1M+ records in minutes vs hours)
- OpenTelemetry tracing for all database operations
Import utilities:
- import_calendar: Parse .ics with optional embedding generation
- import_location_history: High-volume GPS data with batch inserts
- import_search_history: Always generates embeddings for semantic search
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements foundation for EXIF-based photo search capabilities:
- Add geo.rs module with GPS distance calculations (Haversine + bounding box)
- Extend FilesRequest with EXIF search parameters (camera, GPS, date, media type)
- Add MediaType enum and DateTakenAsc/DateTakenDesc sort options
- Create date_taken index migration for efficient date queries
- Implement ExifDao methods: get_exif_batch, query_by_exif, get_camera_makes
- Add FileWithMetadata struct for date-aware sorting
- Implement date sorting with filename extraction fallback
- Make extract_date_from_filename public for reuse
Next: Integrate EXIF filtering into list_photos() and enhance get_all_tags()