Probe-phase scaffolding for CLIP semantic search.

Adds the column that will hold per-photo embeddings, the HTTP client to Apollo's inference service, and a throwaway probe binary so we can eyeball search-result quality on the live library before building the persistence layer (backlog drain, /photos/search endpoint, UI).

- migrations/2026-05-14-000000_add_clip_embedding/: adds image_exif.clip_embedding (BLOB) and clip_model_version (TEXT), plus a partial index on id restricted to rows WHERE clip_embedding IS NULL AND content_hash IS NOT NULL, for the future backfill drain.
- src/database/models.rs: extends the ImageExif struct to match.
- src/ai/clip_client.rs: encode_image / encode_text / health, with the same Permanent/Transient/Disabled error taxonomy as face_client.
- src/bin/probe_clip_search.rs: --query <q> --library N --limit M --top K. Encodes a sample and prints the top-K cosine similarities. No DB writes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
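The ranking step the probe performs ("prints top-K cosine similarities") can be illustrated with a minimal sketch. This is not the code in src/bin/probe_clip_search.rs; the helper names `dot` and `top_k` and the `(photo_id, embedding)` tuple shape are assumptions made for illustration. It relies only on the property stated in the migration comments: the vectors are L2-normalized, so a dot product is already the cosine similarity.

```rust
/// Dot product of two vectors. For L2-normalized inputs this equals their
/// cosine similarity. (Hypothetical helper, not from the commit.)
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical helper: score every `(photo_id, embedding)` pair against the
/// query embedding and keep the `k` highest scores, best first.
fn top_k(query: &[f32], photos: &[(i32, Vec<f32>)], k: usize) -> Vec<(i32, f32)> {
    let mut scored: Vec<(i32, f32)> = photos
        .iter()
        // Skip vectors whose dimension differs from the query's, so two
        // model geometries are never compared against each other.
        .filter(|(_, emb)| emb.len() == query.len())
        .map(|(id, emb)| (*id, dot(query, emb)))
        .collect();
    // total_cmp is a total order on f32, so a NaN score cannot panic the sort.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Toy 2-dimensional unit vectors standing in for real CLIP embeddings.
    let query = vec![1.0_f32, 0.0];
    let photos = vec![
        (1, vec![0.0_f32, 1.0]),
        (2, vec![1.0, 0.0]),
        (3, vec![0.6, 0.8]),
    ];
    for (id, score) in top_k(&query, &photos, 2) {
        println!("{id}\t{score:.3}");
    }
}
```

A full sort is O(n log n) in the sample size, which is likely fine for a throwaway probe; a bounded heap would be the usual replacement if a real search route needed to rank the whole library.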
28 lines
1.4 KiB
SQL
-- CLIP semantic photo search: store a per-photo image embedding so
-- text queries can rerank against the live library via cosine
-- similarity. Apollo encodes the bytes via its CLIP service; ImageApi
-- writes the resulting blob here.
--
-- `clip_embedding` is the raw little-endian float32 buffer of an
-- L2-normalized vector. Its size depends on the model: 768 floats x 4
-- bytes = 3072 bytes for ViT-L/14, 512 floats x 4 bytes = 2048 bytes
-- for ViT-B/32. Apollo always returns the normalized form, so the
-- search-time cosine similarity reduces to a plain dot product.
--
-- `clip_model_version` echoes the upstream `APOLLO_CLIP_MODEL` (e.g.
-- "ViT-L/14"). A model swap shouldn't silently mix geometries: the
-- backfill drain will make rows eligible again when their stored
-- model_version differs from the live engine's, and the search route
-- refuses to mix rows from two model_versions in the same response.
ALTER TABLE image_exif ADD COLUMN clip_embedding BLOB;
ALTER TABLE image_exif ADD COLUMN clip_model_version TEXT;

-- Partial index for the backfill drain. Mirrors the shape of
-- `idx_image_exif_date_backfill`: candidate rows are those with a
-- known content_hash (so we don't race the unhashed backlog) but no
-- embedding yet. SELECT cost stays O(missing rows) instead of a full
-- table scan once the column is mostly populated.
CREATE INDEX IF NOT EXISTS idx_image_exif_clip_backfill
ON image_exif (id)
WHERE clip_embedding IS NULL AND content_hash IS NOT NULL;
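
The blob layout described in the migration comments (raw little-endian float32 values of an L2-normalized vector) could be handled on the Rust side roughly as follows. This is a sketch under stated assumptions, not the code in this commit: `embedding_to_blob`, `blob_to_embedding`, and `l2_normalize` are hypothetical helper names, and the length check against an expected dimension is one possible way to catch a truncated blob or a blob written by a different model.

```rust
/// Serialize an embedding as raw little-endian float32 bytes, 4 bytes per
/// value. (Hypothetical helper, not from the commit.)
fn embedding_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Decode a blob back into floats. Returns None when the byte length is not
/// exactly `expected_dim * 4`, e.g. a truncated blob or another model's vector.
fn blob_to_embedding(blob: &[u8], expected_dim: usize) -> Option<Vec<f32>> {
    if blob.len() != expected_dim * 4 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Scale a vector to unit L2 norm. A zero vector is returned unchanged to
/// avoid dividing by zero.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

fn main() {
    // A stand-in 768-dimensional vector (the ViT-L/14 size named above).
    let raw: Vec<f32> = (0..768).map(|i| (i as f32) + 1.0).collect();
    let unit = l2_normalize(&raw);

    let blob = embedding_to_blob(&unit);
    assert_eq!(blob.len(), 768 * 4); // 3072 bytes

    // Round trip succeeds for the right dimension and fails for the wrong one.
    assert_eq!(blob_to_embedding(&blob, 768), Some(unit));
    assert_eq!(blob_to_embedding(&blob, 512), None);
    println!("blob bytes: {}", blob.len());
}
```

Checking the byte length only detects a dimension mismatch; two models with the same dimension would still need the `clip_model_version` comparison the comments describe.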