docs(claude): note in-place edit gap as future Branch D
The maintenance pipeline added in Branch C assumes the bytes behind a
(library_id, rel_path) are stable for as long as the file lives at that
path. In-place edits (crop, re-export to same name) bypass
process_new_files's already-indexed check, so the row's
content_hash stays pinned to the original bytes, and tags / faces /
insights silently remain attached to that hash.
Document the gap and the proposed shape of the fix:
- Stale-content detection pass: compare last_modified / size_bytes
  to fs::metadata, re-hash on mismatch, update image_exif.
- "Content branched" semantics on hash change: faces re-run, tags
  migrate forward (user intent survives a crop), insights migrate
  and are flagged for re-generation, favorites follow path.
- Apollo derived.db cache invalidation belongs in the same design
  cycle, not after.
Captured here so the design intent is clear before someone hits the
case in real life. No code change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
31
CLAUDE.md
@@ -270,6 +270,37 @@ without disappearance flips through pass 1 (lib-A row retired) and
pass 2 (back-refs follow), with pass 3 noting nothing because the
hash is still present in `image_exif` (lib-B's row).

**Known gap: in-place content changes (future Branch D).** The
maintenance pipeline assumes the bytes behind a `(library_id, rel_path)`
are stable for as long as the file exists at that path. If a user edits
a file in place (crop, re-export) without renaming, the watcher's
quick scan walks the file (mtime is recent) but `process_new_files`
short-circuits because `(library_id, rel_path)` already has an
`image_exif` row: no re-hash, no re-EXIF, no face redetection. The
row's `content_hash` keeps pointing at the original bytes. Tags /
faces / insights stay attached to the original hash and continue to
display because the rel_path back-ref still resolves; new faces
introduced by the edit are never detected.

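To make the gap concrete, here is a minimal sketch of an already-indexed check that consults only the `(library_id, rel_path)` key. It is not the actual `process_new_files` code: `RowKey`, `ExifRow`, `ScanOutcome`, and `process_file` are hypothetical names, and an in-memory map stands in for the `image_exif` table.

```rust
use std::collections::HashMap;

/// Hypothetical key shape: (library_id, rel_path). The real schema is not shown here.
type RowKey = (i64, String);

/// Hypothetical stand-in for an `image_exif` row.
#[derive(Debug, Clone, PartialEq)]
pub struct ExifRow {
    pub content_hash: String,
}

#[derive(Debug, PartialEq)]
pub enum ScanOutcome {
    /// New path: a row was inserted with the hash of the current bytes.
    Indexed,
    /// Existing path: skipped without looking at the bytes again.
    AlreadyIndexed,
}

/// Only the key is consulted, so changed bytes under an existing key
/// never reach the hashing step. This is the gap described above.
pub fn process_file(
    index: &mut HashMap<RowKey, ExifRow>,
    library_id: i64,
    rel_path: &str,
    hash_of_current_bytes: &str,
) -> ScanOutcome {
    let key = (library_id, rel_path.to_string());
    if index.contains_key(&key) {
        return ScanOutcome::AlreadyIndexed;
    }
    index.insert(
        key,
        ExifRow {
            content_hash: hash_of_current_bytes.to_string(),
        },
    );
    ScanOutcome::Indexed
}
```

Calling `process_file` twice for the same path with different hashes leaves the first hash in the map, which mirrors a crop saved over the original file.
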
The right place to fix this is a **stale-content detection pass**
that compares `image_exif.last_modified` / `size_bytes` to
`fs::metadata` for rows the quick scan would otherwise skip. On
mismatch, recompute the hash, update `image_exif`, and apply the
"content branched" semantics:
- **Faces** re-run (faces are fully derived from bytes).
- **Tags** migrate to the new hash (user intent such as "this photo is
  vacation" survives a crop). Insights migrate forward as a
  starting point and are flagged for re-generation.
- **Favorites** (when migrated to hash-keyed) follow the path /
  user intent.

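The comparison step of the detection pass could look roughly like the sketch below. It assumes `last_modified` is stored as whole seconds since the Unix epoch and `size_bytes` as a `u64`; neither type is confirmed by this document, and `StoredMeta`, `current_meta`, and `is_stale` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Hypothetical subset of an `image_exif` row; the field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredMeta {
    pub last_modified: i64, // seconds since the Unix epoch
    pub size_bytes: u64,
}

/// Read the same two fields from the filesystem via `fs::metadata`.
pub fn current_meta(path: &Path) -> io::Result<StoredMeta> {
    let md = fs::metadata(path)?;
    let last_modified = md
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs() as i64)
        .unwrap_or(0); // mtime before the epoch: treated as 0 in this sketch
    Ok(StoredMeta {
        last_modified,
        size_bytes: md.len(),
    })
}

/// A row is a re-hash candidate when either field differs from what is on disk.
/// A mismatch only means "re-hash and compare"; the bytes may still be identical.
pub fn is_stale(stored: &StoredMeta, on_disk: &StoredMeta) -> bool {
    stored.last_modified != on_disk.last_modified || stored.size_bytes != on_disk.size_bytes
}
```

Keeping `is_stale` free of I/O makes the rule easy to test, and leaves the expensive re-hash to run only for rows it flags.
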
The interesting case is the operator who keeps an unedited copy in
the archive library and edits the local copy: post-detection, the
archive copy stays on the original hash, the local copy branches to
the new hash, and the two histories cleanly split. Apollo's
`derived.db` cache will need an invalidation hook for the changed
hash; design it alongside Branch D.

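The archive/local split can be sketched as a per-row hash update that returns the follow-up work. This is one possible shape under stated assumptions, not the Branch D design: `Action`, `RowKey`, and `branch_content` are hypothetical, the map stands in for `image_exif`, and what the cache hook should be keyed on is left open.

```rust
use std::collections::HashMap;

/// Hypothetical key shape: (library_id, rel_path).
type RowKey = (i64, String);

/// Hypothetical follow-up actions mirroring the "content branched" semantics.
/// Path-keyed favorites need no action because the path does not change.
#[derive(Debug, PartialEq)]
pub enum Action {
    RerunFaces { hash: String },
    MigrateTags { from: String, to: String },
    MigrateInsightsAndFlag { from: String, to: String },
    /// Placeholder for the `derived.db` hook; its real key is undecided.
    NotifyCache { from: String, to: String },
}

/// Point one row at `new_hash` and return the work that follows.
/// Rows in other libraries that share the old hash are left untouched.
pub fn branch_content(
    rows: &mut HashMap<RowKey, String>, // key -> content_hash
    key: &RowKey,
    new_hash: &str,
) -> Vec<Action> {
    let current = match rows.get_mut(key) {
        Some(hash) => hash,
        None => return Vec::new(), // unknown row: nothing to branch
    };
    if current.as_str() == new_hash {
        return Vec::new(); // metadata changed but the bytes did not
    }
    let old = std::mem::replace(current, new_hash.to_string());
    vec![
        Action::RerunFaces { hash: new_hash.to_string() },
        Action::MigrateTags { from: old.clone(), to: new_hash.to_string() },
        Action::MigrateInsightsAndFlag { from: old.clone(), to: new_hash.to_string() },
        Action::NotifyCache { from: old, to: new_hash.to_string() },
    ]
}
```

With an archive row and a local row that both start on the same hash, branching only the local row leaves the archive row on the original hash, which is the clean split described above.
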
### File Processing Pipeline

**Thumbnail Generation:**