feature/canonical-date-taken #76
Summary
- `date_taken` resolution at ingest via a four-step waterfall (kamadak-exif → exiftool → filename → fs_time), with provenance tracked in a new `image_exif.date_taken_source` column.
- fs_time-sourced; one exiftool subprocess per batch keeps cost flat for legacy backfills.
- `/memories` as a single SQL query against the new column. ~14k-file libraries drop from 10–15 s to single-digit ms. Default lookback bumped 15 → 20 years.

Behavior change worth flagging
EXIF now beats filename when both are present. A photo named `Screenshot_2014-06-01.png` with EXIF `DateTime` of 2021 now appears under 2021, not 2014. The reverse case (no EXIF, parseable filename) is unchanged.
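The precedence described above can be sketched as a first-match-wins chain. This is a minimal illustration, not the PR's `date_resolver`: the `Candidates` struct, the `DateSource` enum, and `resolve_date_taken` are hypothetical names, and each step is assumed to have already produced an optional Unix timestamp.

```rust
/// Hypothetical provenance tag, mirroring the idea of `date_taken_source`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DateSource {
    KamadakExif,
    Exiftool,
    Filename,
    FsTime,
}

/// Hypothetical per-file inputs: the Unix timestamp each step produced, if any.
#[derive(Debug, Default, Clone, Copy)]
pub struct Candidates {
    pub kamadak_exif: Option<i64>,
    pub exiftool: Option<i64>,
    pub filename: Option<i64>,
    pub fs_time: Option<i64>,
}

/// Walk the waterfall in order (kamadak-exif, exiftool, filename, fs_time)
/// and return the first available timestamp together with its source.
pub fn resolve_date_taken(c: &Candidates) -> Option<(i64, DateSource)> {
    c.kamadak_exif
        .map(|t| (t, DateSource::KamadakExif))
        .or_else(|| c.exiftool.map(|t| (t, DateSource::Exiftool)))
        .or_else(|| c.filename.map(|t| (t, DateSource::Filename)))
        .or_else(|| c.fs_time.map(|t| (t, DateSource::FsTime)))
}
```

With `kamadak_exif: Some(1609459200)` (2021-01-01 UTC) and `filename: Some(1401580800)` (2014-06-01 UTC), the function returns the 2021 timestamp tagged `KamadakExif`, which matches the screenshot example. With only `filename` set, it returns the 2014 timestamp tagged `Filename`.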
Test plan
- `cargo fmt && cargo clippy --all-targets` clean
- `DATABASE_URL=:memory: BASE_PATH=/tmp SECRET_KEY=test cargo test --lib`: 259/259 pass
- `/memories` latency on the 14k-file library
- `date_taken_source` populates accordingly

Replaces the EXIF-loop + WalkDir-fallback pipeline that powered `/memories` with a single per-library SQL query (`get_memories_in_window`) that uses `strftime('%m-%d' | '%W' | '%m', date_taken, 'unixepoch', tz_offset)` for calendar matching in the client's timezone, plus a `years_back` lower bound and a no-future-dates upper bound. Returns only the matching rows; the handler applies per-library `PathExcluder` post-query and sorts.

Drops:
- `collect_exif_memories`: replaced by the single SQL query.
- `collect_filesystem_memories`: the canonical-date pipeline now populates `date_taken` for every row at ingest, so the WalkDir fallback that scanned 14k+ files each request is no longer needed.
- `get_memory_date_with_priority` and friends: request-time waterfall superseded by `date_resolver` running at ingest. The associated three priority-tests are dropped; their replacement lives in `date_resolver::tests`.

On a ~14k-file library this drops `/memories` from 10–15 s (dominated by `fs::metadata` per row) to single-digit ms.

Bumps `DEFAULT_YEARS_BACK` from 15 → 20 to surface deeper archives on matching anniversaries.

Note vs. ISO weeks: the original Rust used `chrono::iso_week().week()` for week-span matching. SQLite's `%W` is Monday-anchored but uses week 0 for days before the first Monday, so it can disagree with ISO at year boundaries by ±1. Acceptable for nostalgia browsing.

Adds 3 new DAO tests covering month-span filter, library scoping, and the unknown-span-token guard.

Also adds a CLAUDE.md section describing the canonical-date pipeline end-to-end and the new `DATE_BACKFILL_MAX_PER_TICK` env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
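To make the ISO-week note concrete, the sketch below re-implements both numbering schemes in plain Rust and compares them around January 1. It is an illustration only: it calls neither SQLite nor chrono, and every function name here (`days_from_civil`, `strftime_w`, `iso_week`, and the helpers) is hypothetical. The `%W` formula follows SQLite's documented definition (week 00 before the first Monday); the ISO formula is the standard ordinal-date calculation.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date (civil-days algorithm).
pub fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400;
    let mp = (m + 9) % 12; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146097 + doe - 719468
}

/// Weekday with Monday = 0 .. Sunday = 6 (1970-01-01 was a Thursday).
pub fn weekday_monday0(y: i64, m: i64, d: i64) -> i64 {
    (days_from_civil(y, m, d) + 3).rem_euclid(7)
}

/// Zero-based day of the year (January 1 = 0).
pub fn day_of_year0(y: i64, m: i64, d: i64) -> i64 {
    days_from_civil(y, m, d) - days_from_civil(y, 1, 1)
}

/// Week number in the style of strftime `%W`: days before the first Monday are week 0.
pub fn strftime_w(y: i64, m: i64, d: i64) -> i64 {
    (day_of_year0(y, m, d) + 7 - weekday_monday0(y, m, d)) / 7
}

/// Number of ISO weeks (52 or 53) in a year.
pub fn iso_weeks_in_year(y: i64) -> i64 {
    let p = |y: i64| (y + y / 4 - y / 100 + y / 400).rem_euclid(7);
    if p(y) == 4 || p(y - 1) == 3 { 53 } else { 52 }
}

/// ISO 8601 week number (1..=53); early-January days may belong to the previous year's last week.
pub fn iso_week(y: i64, m: i64, d: i64) -> i64 {
    let ordinal = day_of_year0(y, m, d) + 1;
    let iso_wd = weekday_monday0(y, m, d) + 1;
    let week = (ordinal - iso_wd + 10) / 7;
    if week < 1 {
        iso_weeks_in_year(y - 1)
    } else if week > iso_weeks_in_year(y) {
        1
    } else {
        week
    }
}
```

Under these definitions, 2020-12-31 (a Thursday) and 2021-01-01 (a Friday) share ISO week 53, while `strftime_w` gives 52 and 0 respectively, so a week-span match that straddles New Year groups photos differently under the two schemes.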