clip-search: migration + client + probe binary

Probe-phase scaffolding for CLIP semantic search. Adds the column
that will hold per-photo embeddings, the HTTP client to Apollo's
inference service, and a throwaway probe binary so we can eyeball
search-result quality on the live library before building the
persistence layer (backlog drain, /photos/search endpoint, UI).

- migrations/2026-05-14-000000_add_clip_embedding/ — adds
  image_exif.clip_embedding (BLOB) and clip_model_version (TEXT),
  plus a partial index on (clip_embedding IS NULL AND content_hash
  IS NOT NULL) for the future backfill drain.
- src/database/models.rs — extends ImageExif struct to match.
- src/ai/clip_client.rs — encode_image / encode_text / health,
  same Permanent/Transient/Disabled taxonomy as face_client.
- src/bin/probe_clip_search.rs — --query <q> --library N --limit M
  --top K. Encodes a sample and prints top-K cosine similarities.
  No DB writes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-14 12:54:07 -04:00
parent 26ffc15c8b
commit 8d9e76cf15
9 changed files with 711 additions and 0 deletions

View File

@@ -0,0 +1,27 @@
-- CLIP semantic photo search: store a per-photo image embedding so
-- text queries can rerank against the live library via cosine
-- similarity. Apollo encodes the bytes via its CLIP service; ImageApi
-- writes the resulting blob here.
--
-- `clip_embedding` is the raw little-endian float32 buffer of an
-- L2-normalized vector (dim depends on the model — 768 bytes×4 for
-- ViT-L/14, 512 bytes×4 for ViT-B/32). Apollo always returns the
-- normalized form so the search-time dot product reduces to a plain
-- cosine similarity.
--
-- `clip_model_version` echoes the upstream `APOLLO_CLIP_MODEL` (e.g.
-- "ViT-L/14"). A model swap shouldn't silently mix geometries — the
-- backfill drain will re-eligibilize rows whose stored model_version
-- differs from the live engine's, and the search route refuses to
-- mix rows from two model_versions in the same response.
ALTER TABLE image_exif ADD COLUMN clip_embedding BLOB;
ALTER TABLE image_exif ADD COLUMN clip_model_version TEXT;
-- Partial index for the backfill drain. Mirrors the shape of
-- `idx_image_exif_date_backfill`: candidate rows are those with a
-- known content_hash (so we don't race the unhashed backlog) but no
-- embedding yet. SELECT cost stays O(missing rows) instead of full
-- table scan once the column is mostly populated.
CREATE INDEX IF NOT EXISTS idx_image_exif_clip_backfill
ON image_exif (id)
WHERE clip_embedding IS NULL AND content_hash IS NOT NULL;