feat: add content_hash backfill + register every media file

Adds blake3 content hashing as the basis for derivative dedup
(thumbnails, HLS) across libraries. Computed inline by the watcher on
ingest and by a new `backfill_hashes` binary for historical rows.

Key changes:
- `content_hash` and `size_bytes` are now populated on new image_exif
  rows; a new ExifDao surface (`get_rows_missing_hash`,
  `backfill_content_hash`, `find_by_content_hash`) supports backfill and
  future hash-keyed lookups.
- The watcher now registers every image/video in image_exif, not just
  files with parseable EXIF. EXIF becomes optional enrichment; videos
  and other non-EXIF files still get a hashed row. This also makes
  DB-indexed sort/filter cover the full library.
- `/image` thumbnail serve dual-looks up hash-keyed path first, then
  falls back to the legacy mirrored layout.
- Upload flow accepts `?library=` query param + hashes uploaded files.
- Store_exif logs the underlying Diesel error on insert failure so
  constraint violations surface instead of hiding behind a generic
  InsertError.
- New migration normalizes rel_path separators to forward slash across
  all tables, deduplicating any rows that collide after normalization.
  Fixes spurious UNIQUE violations from mixed backslash/forward-slash
  paths on Windows ingest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Cameron
2026-04-17 16:25:39 -04:00
parent 561e261d39
commit 524f00b068
11 changed files with 681 additions and 69 deletions

View File

@@ -67,7 +67,7 @@ fn main() -> anyhow::Result<()> {
let context = opentelemetry::Context::new();
let relative_path = match path.strip_prefix(&base) {
Ok(p) => p.to_str().unwrap().to_string(),
Ok(p) => p.to_str().unwrap().replace('\\', "/"),
Err(_) => {
eprintln!(
"Error: Could not create relative path for {}",