From 5f247be1f1a6b40133d9b9c0d4f3f69e60bd773a Mon Sep 17 00:00:00 2001
From: Cameron Cordes
Date: Fri, 1 May 2026 16:53:08 +0000
Subject: [PATCH] docs(claude): note in-place edit gap as future Branch D
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The maintenance pipeline added in Branch C assumes (library_id, rel_path)
bytes are stable for as long as the file lives at that path. In-place
edits (crop, re-export to same name) bypass process_new_files's
already-indexed check, so the row's content_hash stays pinned to the
original bytes — tags / faces / insights remain attached to that hash
silently.

Document the gap and the proposed shape of the fix:

- Stale-content detection pass: compare last_modified / size_bytes to
  fs::metadata, re-hash on mismatch, update image_exif.
- "Content branched" semantics on hash change: faces re-run, tags
  migrate forward (user intent survives a crop), insights migrate +
  flag for re-generation, favorites follow path.
- Apollo derived.db cache invalidation belongs in the same design
  cycle, not after.

Captured here so the design intent is clear before someone hits the
case in real life. No code change.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 CLAUDE.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/CLAUDE.md b/CLAUDE.md
index 82a1e62..b6864cd 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -270,6 +270,37 @@ without disappearance flips through pass 1 (lib-A row retired) and
 pass 2 (back-refs follow), with pass 3 noting nothing because the
 hash is still present in `image_exif` (lib-B's row).
 
+**Known gap: in-place content changes (future Branch D).** The
+maintenance pipeline assumes a `(library_id, rel_path)`'s bytes are
+stable for as long as the file exists at that path. If a user edits
+a file in place (crop, re-export) without renaming, the watcher's
+quick scan walks the file (mtime is recent) but `process_new_files`
+short-circuits because `(library_id, rel_path)` already has an
+`image_exif` row — no re-hash, no re-EXIF, no face redetection. The
+row's `content_hash` keeps pointing at the original bytes. Tags /
+faces / insights stay attached to the original hash and continue to
+display because the rel_path back-ref still resolves; new faces
+introduced by the edit are never detected.
+
+The right place to fix this is a **stale-content detection pass**
+that compares `image_exif.last_modified` / `size_bytes` to
+`fs::metadata` for rows the quick scan would otherwise skip. On
+mismatch, recompute the hash, update `image_exif`, and apply the
+"content branched" semantics:
+- **Faces** re-run (faces are fully derived from bytes).
+- **Tags** migrate to the new hash (user intent — "this photo is
+  vacation" survives a crop). Insights migrate forward as a
+  starting point and are flagged for re-generation.
+- **Favorites** (when migrated to hash-keyed) follow the path /
+  user intent.
+
+The interesting case is the operator who keeps an unedited copy in
+the archive library and edits the local copy: post-detection, the
+archive copy stays on the original hash, the local copy branches to
+the new hash, and the two histories cleanly split. Apollo's
+`derived.db` cache will need an invalidation hook for the changed
+hash — design it alongside Branch D.
+
 ### File Processing Pipeline
 
 **Thumbnail Generation:**