faces: count distinct content_hash in stats total_photos #65
Reference in New Issue
Block a user
Delete Branch "face-stats-dedup-hash"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
face_detections is keyed on content_hash (one row per unique bytes,
shared across libraries / duplicate paths) but total_photos was
COUNT(*) over image_exif rows. A file present at multiple rel_paths or
across libraries inflated the denominator without inflating the
numerator, leaving a permanent gap (e.g. 1101/1103 with nothing
actually pending detection).
Switch total_photos to COUNT(DISTINCT content_hash) so numerator and
denominator live in the same domain. Exclude rows with NULL
content_hash from the count — they're held in the hash-backfill
backlog, not the detection backlog, and counting them pins the bar
below 100% for the duration of that pass.
CLAUDE.md: document the stats domain rule next to the rest of the
face-detection notes.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com