clip-search: backlog drain + /photos/search endpoint

Wires the persistence layer for CLIP semantic search. On each tick the
watcher drains a backlog: any image_exif row with a known content_hash
but no clip_embedding is encoded via Apollo, up to
CLIP_BACKLOG_MAX_PER_TICK rows (default 32). On a query, /photos/search
encodes the text via Apollo and reranks every stored embedding in
memory.
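
The in-memory rerank can be pictured roughly as follows. This is a
minimal sketch, not the handler in this commit: it assumes cosine
similarity as the score, keys the index by path for brevity (the stored
index is (hash, embedding)), and the names Hit, cosine and rerank are
hypothetical.

```rust
use std::cmp::Ordering;

/// One search hit in the minimal (path, score) shape.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub path: String,
    pub score: f32,
}

/// Cosine similarity. Returns 0.0 for empty, mismatched-length,
/// or zero-norm vectors so callers never see NaN from a division.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Score every stored embedding against the query, drop anything
/// below `threshold`, and keep the top `limit` by descending score.
/// Assumption: the index is keyed by path; the real one is keyed by hash.
pub fn rerank(
    query: &[f32],
    index: &[(String, Vec<f32>)],
    threshold: f32,
    limit: usize,
) -> Vec<Hit> {
    let mut hits: Vec<Hit> = index
        .iter()
        .map(|(path, emb)| Hit {
            path: path.clone(),
            score: cosine(query, emb),
        })
        .filter(|h| h.score >= threshold)
        .collect();
    hits.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap_or(Ordering::Equal));
    hits.truncate(limit);
    hits
}
```

A full scan like this is linear in the number of stored embeddings,
which is the trade-off implied by reranking everything per query.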

ExifDao additions:
- list_clip_unencoded_candidates — partial-index scan for drain
- backfill_clip_embedding — touches only the two new columns
- list_clip_index — dedup'd (hash, embedding) pull for search

clip_watch::run_clip_encoding_pass is the parallel fan-out: one tokio
runtime per pass, with concurrency set by CLIP_ENCODE_CONCURRENCY
(default 4). Permanent failures get no marker rows yet, so they retry
every tick; the per-tick cap bounds that retry cost.
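
The bounded fan-out idea can be illustrated with the standard library
alone. This is a sketch under stated assumptions: it swaps the tokio
runtime for std::thread::scope so it has no dependencies, and fan_out
and its signature are hypothetical, not the API of clip_watch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Run `encode` over `items` with at most `concurrency` workers.
/// Returns one result per item, in input order. A failed item is
/// reported as Err and does not stop the other items.
pub fn fan_out<T, R, F>(items: &[T], concurrency: usize, encode: F) -> Vec<Result<R, String>>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> Result<R, String> + Sync,
{
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<Result<R, String>>>> =
        Mutex::new((0..items.len()).map(|_| None).collect());
    let workers = concurrency.max(1).min(items.len().max(1));

    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= items.len() {
                    break;
                }
                // The slow call runs without holding the results lock.
                let r = encode(&items[i]);
                let mut guard = results.lock().expect("results lock");
                guard[i] = Some(r);
            });
        }
    });

    results
        .into_inner()
        .expect("results lock")
        .into_iter()
        .map(|slot| slot.expect("every index is claimed exactly once"))
        .collect()
}
```

Because failures come back as plain Err values, a caller can log them
and leave the row in the candidate set for the next tick.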

/photos/search params: q, limit, threshold (default 0.20), library,
model_version. Response is intentionally minimal (path + score) so
the frontend joins against existing photo-metadata routes lazily.
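
For illustration only, the parameter set could be modelled like this.
The sketch assumes q is required and the other parameters are optional,
does no percent-decoding, and uses hypothetical names (SearchParams,
parse_search_params); only the 0.20 threshold default comes from this
commit message.

```rust
/// Default similarity cutoff named in the commit message.
pub const DEFAULT_THRESHOLD: f32 = 0.20;

#[derive(Debug, Clone, PartialEq)]
pub struct SearchParams {
    pub q: String,
    pub limit: Option<usize>,
    pub threshold: f32,
    pub library: Option<String>,
    pub model_version: Option<String>,
}

/// Parse a raw query string such as `q=beach&limit=5`.
/// Simplifications: no percent-decoding, unknown keys are ignored,
/// and a missing or empty `q` is treated as an error (an assumption).
pub fn parse_search_params(query: &str) -> Result<SearchParams, String> {
    let mut q = None;
    let mut limit = None;
    let mut threshold = DEFAULT_THRESHOLD;
    let mut library = None;
    let mut model_version = None;

    for pair in query.split('&').filter(|p| !p.is_empty()) {
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match key {
            "q" => q = Some(value.to_string()),
            "limit" => {
                limit = Some(value.parse::<usize>().map_err(|e| format!("limit: {e}"))?)
            }
            "threshold" => {
                threshold = value.parse::<f32>().map_err(|e| format!("threshold: {e}"))?
            }
            "library" => library = Some(value.to_string()),
            "model_version" => model_version = Some(value.to_string()),
            _ => {}
        }
    }

    let q = q
        .filter(|s| !s.is_empty())
        .ok_or_else(|| "q is required".to_string())?;
    Ok(SearchParams {
        q,
        limit,
        threshold,
        library,
        model_version,
    })
}
```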

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-05-14 14:00:41 -04:00
parent 8d9e76cf15
commit 32195ed89e
9 changed files with 875 additions and 0 deletions

@@ -220,6 +220,74 @@ pub fn backfill_missing_date_taken(
/// unscanned image_exif rows directly via the FaceDao anti-join and
/// hands them to the existing detection pass. Runs on every tick (not
/// just full scans) so the backlog moves at quick-scan cadence.
/// Per-tick CLIP encoding drain. Mirrors `process_face_backlog`: pull
/// up to `CLIP_BACKLOG_MAX_PER_TICK` candidates with a known
/// `content_hash` but no `clip_embedding`, hand them to
/// `clip_watch::run_clip_encoding_pass` for parallel fan-out, and let
/// that module write the result back via `backfill_clip_embedding`.
///
/// Idempotent: a row stays in the candidate set until its embedding
/// lands, so a transient failure (Apollo unreachable, CUDA OOM) just
/// defers to the next tick. Permanent failures (undecodable bytes)
/// currently retry every tick; a future branch may add a status
/// column like the one face_detections has.
pub fn process_clip_backlog(
    context: &opentelemetry::Context,
    library: &libraries::Library,
    clip_client: &crate::ai::clip_client::ClipClient,
    exif_dao: &Arc<Mutex<Box<dyn ExifDao>>>,
    excluded_dirs: &[String],
) {
    if !clip_client.is_enabled() {
        return;
    }
    let cap: i64 = dotenv::var("CLIP_BACKLOG_MAX_PER_TICK")
        .ok()
        .and_then(|s| s.parse().ok())
        .filter(|n: &i64| *n > 0)
        .unwrap_or(32);
    let rows: Vec<(String, String)> = {
        let mut dao = exif_dao.lock().expect("exif dao");
        match dao.list_clip_unencoded_candidates(context, library.id, cap) {
            Ok(r) => r,
            Err(e) => {
                warn!(
                    "clip_watch: list_clip_unencoded_candidates failed for library '{}': {:?}",
                    library.name, e
                );
                return;
            }
        }
    };
    if rows.is_empty() {
        return;
    }
    info!(
        "clip_watch: backlog drain — encoding {} candidate(s) for library '{}' (cap={})",
        rows.len(),
        library.name,
        cap
    );
    let candidates: Vec<crate::clip_watch::ClipCandidate> = rows
        .into_iter()
        .map(|(rel_path, content_hash)| crate::clip_watch::ClipCandidate {
            rel_path,
            content_hash,
        })
        .collect();
    crate::clip_watch::run_clip_encoding_pass(
        library,
        excluded_dirs,
        clip_client,
        Arc::clone(exif_dao),
        candidates,
    );
}
pub fn process_face_backlog(
    context: &opentelemetry::Context,
    library: &libraries::Library,