date_backfill: per-tick drain for unresolved date_taken rows

Adds two ExifDao methods (`get_rows_needing_date_backfill` /
`backfill_date_taken`) and a `backfill_missing_date_taken` watcher pass
that runs on every tick alongside `backfill_unhashed_backlog`.

The drain queries the partial index for rows where `date_taken IS NULL`
or `date_taken_source = 'fs_time'`, batches up to
`DATE_BACKFILL_MAX_PER_TICK` paths (default 500), and feeds them through
`date_resolver::resolve_dates_batch` — a single exiftool subprocess
covers the whole tick. Rows that newly resolve to `exiftool` /
`filename` / `fs_time` get persisted via `backfill_date_taken` (touches
only `date_taken` + `date_taken_source` so EXIF / hash / perceptual
columns survive).

`filename`-sourced rows are intentionally not re-resolved — the regex
is authoritative when it matches and re-running exiftool wouldn't
change the answer. Files that have disappeared from disk are skipped
so a ghost row doesn't loop through the drain forever; the
missing-file scan in `library_maintenance` retires those separately.

Comes with two DAO unit tests (eligibility filter + column-isolation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-06 16:03:03 -04:00
parent 2d14291733
commit 54e0635a98
3 changed files with 291 additions and 0 deletions

View File

@@ -1646,6 +1646,26 @@ mod tests {
Ok(())
}
fn get_rows_needing_date_backfill(
&mut self,
_context: &opentelemetry::Context,
_library_id: i32,
_limit: i64,
) -> Result<Vec<(i32, String)>, DbError> {
Ok(Vec::new())
}
fn backfill_date_taken(
&mut self,
_context: &opentelemetry::Context,
_library_id: i32,
_rel_path: &str,
_date_taken: i64,
_source: &str,
) -> Result<(), DbError> {
Ok(())
}
fn find_by_content_hash(
&mut self,
_context: &opentelemetry::Context,