faces: phase 4 — people-tag bootstrap + auto-bind on detection

Wires the existing string people-tags into the new persons table and
auto-binds new detections to a same-named person when the photo carries
exactly one matching tag. ImageApi has no notion of which tags are
people-tags today (purely a user mental model), so this is operator-
confirmed: the suggester surfaces candidates with a heuristic flag, the
operator confirms, then bootstrap creates persons rows. Auto-bind
follows on every detection thereafter.

New endpoints:
  GET  /tags/people-bootstrap-candidates
       Per case-insensitive name group: display name (most-frequent
       capitalization), normalized lowercase, summed usage_count,
       looks_like_person heuristic flag, already_exists check against
       the persons table. Sorted persons-likely-first then by count.
  POST /persons/bootstrap
       Body: {names: [string]}. Idempotent — pre-fetches the existing-
       name set so a duplicate request reports per-row "already exists"
       instead of 409-ing each insert. Created rows get
       created_from_tag=true; failed rows surface in `skipped` with a
       reason.
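
A minimal sketch of the grouping and the idempotent split described
above, not the handler code itself. The Candidate struct, both function
names, the is_person callback, and the choice to weigh "most-frequent"
by usage count are assumptions made for illustration; `existing` stands
in for the pre-fetched lowercase persons-name set.

```rust
use std::collections::{HashMap, HashSet};

/// One candidate row. Field names follow the message; the struct is hypothetical.
#[derive(Debug, PartialEq)]
struct Candidate {
    display_name: String,
    normalized: String,
    usage_count: i64,
    looks_like_person: bool,
    already_exists: bool,
}

/// Group (tag name, usage count) pairs case-insensitively.
fn group_candidates(
    tags: &[(&str, i64)],
    existing: &HashSet<String>,
    is_person: impl Fn(&str) -> bool,
) -> Vec<Candidate> {
    // normalized name -> (exact spelling -> summed count)
    let mut groups: HashMap<String, HashMap<String, i64>> = HashMap::new();
    for (name, count) in tags {
        *groups
            .entry(name.to_lowercase())
            .or_default()
            .entry(name.to_string())
            .or_insert(0) += *count;
    }
    let mut out: Vec<Candidate> = groups
        .into_iter()
        .map(|(normalized, spellings)| {
            let usage_count: i64 = spellings.values().sum();
            // Most-used spelling wins; ties go to the lexicographically
            // smaller spelling so the result is deterministic.
            let display_name = spellings
                .iter()
                .max_by(|a, b| a.1.cmp(b.1).then_with(|| b.0.cmp(a.0)))
                .map(|(s, _)| s.clone())
                .unwrap_or_default();
            Candidate {
                looks_like_person: is_person(&display_name),
                already_exists: existing.contains(&normalized),
                display_name,
                normalized,
                usage_count,
            }
        })
        .collect();
    // Likely persons first, then by descending count, then by name.
    out.sort_by(|a, b| {
        b.looks_like_person
            .cmp(&a.looks_like_person)
            .then(b.usage_count.cmp(&a.usage_count))
            .then_with(|| a.normalized.cmp(&b.normalized))
    });
    out
}

/// Split a bootstrap request into names to insert and (name, reason) skips.
/// Assumes the existing-name check is case-insensitive.
fn plan_bootstrap(
    names: &[&str],
    existing: &HashSet<String>,
) -> (Vec<String>, Vec<(String, String)>) {
    let mut seen = existing.clone();
    let mut create = Vec::new();
    let mut skipped = Vec::new();
    for name in names {
        // insert() returns false when the lowercase name is already known,
        // including a duplicate earlier in the same request.
        if seen.insert(name.to_lowercase()) {
            create.push(name.to_string());
        } else {
            skipped.push((name.to_string(), "already exists".to_string()));
        }
    }
    (create, skipped)
}
```

Because the existing-name set is consulted before any insert, replaying
the same request yields an empty create list and one skip per name.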

looks_like_person heuristic — conservative on purpose because the
operator confirms in the UI:
  - 1–2 whitespace-separated words
  - Each word starts uppercase, no digits anywhere
  - Single-word names not on a small denylist (cat, christmas, beach,
    sunset, untagged, ...). Two-word names skip the denylist so
    "Sarah Smith" is never false-rejected.

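The three rules above can be sketched as follows. This is an
illustration rather than the code in faces.rs: the denylist holds only
the five words named in this message (the real list is longer), and
treating "digits" as ASCII digits is an assumption.

```rust
/// Hypothetical denylist: only the entries named in the commit message.
const DENYLIST: &[&str] = &["cat", "christmas", "beach", "sunset", "untagged"];

fn looks_like_person(name: &str) -> bool {
    let words: Vec<&str> = name.split_whitespace().collect();
    // Rule 1: one or two whitespace-separated words.
    if words.is_empty() || words.len() > 2 {
        return false;
    }
    // Rule 2a: no digits anywhere (ASCII digits only in this sketch).
    if name.chars().any(|c| c.is_ascii_digit()) {
        return false;
    }
    // Rule 2b: every word starts with an uppercase letter.
    let capitalized = words
        .iter()
        .all(|w| w.chars().next().is_some_and(|c| c.is_uppercase()));
    if !capitalized {
        return false;
    }
    // Rule 3: the denylist applies to single-word names only, so a
    // two-word name such as "Sarah Smith" is never rejected by it.
    if words.len() == 1 && DENYLIST.contains(&words[0].to_lowercase().as_str()) {
        return false;
    }
    true
}
```

Under these assumptions "Cat" is rejected by the denylist while
"Christmas Beach" passes, which is acceptable because the operator still
confirms every candidate in the UI.
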
FaceDao additions:
  - find_persons_by_names_ci — bulk lowercase-name → person_id lookup
    via sql_query (Diesel's BoxedSelectStatement + LOWER() doesn't
    play well with the type system).
  - person_reference_embedding — L2-normalized mean of a person's
    detected embeddings, *filtered by model_version* so a future
    buffalo_xl row can never contaminate an in-flight buffalo_l auto-
    bind decision. Returns None when the person has no faces yet.
  - assign_face_to_person — sets face_detections.person_id and, only
    when persons.cover_face_id is NULL, claims this face as cover. The
    UI's hand-picked cover survives later auto-binds.
  - decode_embedding_bytes / cosine_similarity helpers — pub(crate)
    so face_watch can decode the wire bytes once and feed them through
    the cosine threshold.
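
For orientation, a sketch of the math behind these helpers. The bodies
are stand-ins, not the faces.rs implementations: little-endian f32 wire
bytes, the length-only size check, and returning 0.0 for empty,
mismatched, or zero-norm input are all assumptions, and
reference_embedding is a hypothetical in-memory substitute for the
DB-backed person_reference_embedding.

```rust
/// Decode raw bytes into f32s. Assumes little-endian IEEE-754; returns
/// None when the length is zero or not a multiple of four.
fn decode_embedding_bytes(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity; 0.0 for empty, mismatched-length, or zero-norm input.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.is_empty() || a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    dot / (na * nb)
}

/// L2-normalized mean of the embeddings stored under `model_version`.
/// Rows from any other model are ignored; None when nothing matches.
fn reference_embedding(rows: &[(&str, Vec<f32>)], model_version: &str) -> Option<Vec<f32>> {
    let mut matching = rows
        .iter()
        .filter(|(mv, _)| *mv == model_version)
        .map(|(_, e)| e);
    let mut sum = matching.next()?.clone();
    for e in matching {
        if e.len() != sum.len() {
            continue; // skip rows with an unexpected dimension
        }
        for (s, x) in sum.iter_mut().zip(e) {
            *s += x;
        }
    }
    // Normalizing the sum gives the same direction as normalizing the
    // mean, so the division by the row count is skipped.
    let norm = sum.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(sum.into_iter().map(|x| x / norm).collect())
}
```

Filtering before averaging is what keeps a buffalo_xl row out of a
buffalo_l reference: the two models' vectors never enter the same sum.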

Auto-bind in face_watch::process_one:
  After every successful detect, for each newly-stored auto face we
  pull the photo's tags, look up which (if any) map to existing
  persons, and:
    - skip when zero or multiple distinct persons are matched
      (multi-match is genuinely ambiguous; cluster suggester handles it)
    - on first face for a person: bind unconditionally; otherwise
      bootstrap could never produce a usable reference
    - thereafter: bind iff cosine(new_emb, person_ref) >=
      FACE_AUTOBIND_MIN_COS (default 0.4, env-tunable to 0..=1)
  The reference embedding comes from person_reference_embedding under
  the same model_version as the candidate, so a model upgrade never
  silently re-anchors a person's centroid.
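
The decision above reduces to three small pure steps, sketched here
apart from the DAO calls. The function names and the split into helpers
are illustrative assumptions; try_auto_bind in the diff inlines the same
logic. parse_threshold takes the raw FACE_AUTOBIND_MIN_COS value as an
argument so the sketch does not read the environment.

```rust
use std::collections::{HashMap, HashSet};

const DEFAULT_MIN_COS: f32 = 0.4;

/// Parse the threshold, falling back to the default when the value is
/// missing, unparseable, NaN, or outside 0..=1.
fn parse_threshold(raw: Option<&str>) -> f32 {
    raw.and_then(|s| s.parse::<f32>().ok())
        .filter(|t| (0.0..=1.0).contains(t))
        .unwrap_or(DEFAULT_MIN_COS)
}

/// Return the person id only when the photo's tags resolve to exactly
/// one distinct person. Several tags may map to the same person.
fn single_person(person_for_tag: &HashMap<String, i32>) -> Option<i32> {
    let ids: HashSet<i32> = person_for_tag.values().copied().collect();
    if ids.len() == 1 {
        ids.into_iter().next()
    } else {
        None // zero matches, or an ambiguous multi-match
    }
}

/// `similarity` is None when the person has no reference embedding yet.
fn should_bind(similarity: Option<f32>, threshold: f32) -> bool {
    match similarity {
        None => true, // first face binds unconditionally
        Some(sim) => sim >= threshold,
    }
}
```

An out-of-range setting such as FACE_AUTOBIND_MIN_COS=1.5 therefore
falls back to 0.4 rather than disabling or forcing binds.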

Plumbing: watch_files now constructs its own SqliteTagDao alongside the
other watcher DAOs and threads it through process_new_files →
run_face_detection_pass → process_one. The handler-side TagDao
registration in main.rs already covers bootstrap_candidates_handler;
no extra app_data wiring needed.

Tests: 8 new (faces.rs):
  - looks_like_person accepts/rejects/two-word-skips-denylist (3)
  - cosine_similarity on identical / orthogonal / opposite / mismatch /
    zero / empty inputs
  - decode_embedding_bytes round-trip + size validation
  - find_persons_by_names_ci groups case + handles empty input
  - person_reference_embedding filters by model_version (buffalo_l ref
    must not include buffalo_xl rows)
  - assign_face_to_person sets cover when unset, doesn't overwrite

cargo test --lib: 179 / 0; fmt + clippy clean for new code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-04-29 18:55:01 +00:00
parent f985a0d658
commit 1859399759
3 changed files with 997 additions and 49 deletions

@@ -16,10 +16,11 @@
use crate::ai::face_client::{DetectMeta, FaceClient, FaceDetectError};
use crate::exif;
use crate::faces::{FaceDao, InsertFaceDetectionInput};
use crate::faces::{self, FaceDao, InsertFaceDetectionInput};
use crate::file_types;
use crate::libraries::Library;
use crate::memories::PathExcluder;
use crate::tags::TagDao;
use log::{debug, info, warn};
use std::path::Path;
use std::sync::{Arc, Mutex};
@@ -41,6 +42,7 @@ pub fn run_face_detection_pass(
excluded_dirs: &[String],
face_client: &FaceClient,
face_dao: Arc<Mutex<Box<dyn FaceDao>>>,
tag_dao: Arc<Mutex<Box<dyn TagDao>>>,
candidates: Vec<FaceCandidate>,
) {
if !face_client.is_enabled() {
@@ -94,13 +96,22 @@ pub fn run_face_detection_pass(
let permit_sem = sem.clone();
let face_client = face_client.clone();
let face_dao = face_dao.clone();
let tag_dao = tag_dao.clone();
let library_root = library_root.clone();
handles.push(tokio::spawn(async move {
// acquire_owned would let us drop the permit explicitly
// before await points; for a one-shot call into Apollo
// the simpler bounded acquire is enough.
let _permit = permit_sem.acquire().await.expect("face semaphore");
process_one(library_id, &library_root, cand, &face_client, face_dao).await;
process_one(
library_id,
&library_root,
cand,
&face_client,
face_dao,
tag_dao,
)
.await;
}));
}
for h in handles {
@@ -117,6 +128,7 @@ async fn process_one(
cand: FaceCandidate,
face_client: &FaceClient,
face_dao: Arc<Mutex<Box<dyn FaceDao>>>,
tag_dao: Arc<Mutex<Box<dyn TagDao>>>,
) {
let abs = Path::new(library_root).join(&cand.rel_path);
// Read the bytes off disk in a blocking-friendly task. Filesystem IO
@@ -148,60 +160,85 @@ async fn process_one(
match face_client.detect(bytes, meta).await {
Ok(resp) => {
// Hold the dao lock only across the synchronous DB writes.
let mut dao = face_dao.lock().expect("face dao");
if resp.faces.is_empty() {
if let Err(e) = dao.mark_status(
&ctx,
library_id,
&cand.content_hash,
&cand.rel_path,
"no_faces",
&resp.model_version,
) {
warn!(
"face_watch: mark no_faces failed for {}: {:?}",
cand.rel_path, e
);
}
debug!(
"face_watch: {} → no faces (model {})",
cand.rel_path, resp.model_version
);
} else {
let face_count = resp.faces.len();
for face in &resp.faces {
let emb = match face.decode_embedding() {
Ok(b) => b,
Err(e) => {
warn!("face_watch: bad embedding for {}: {:?}", cand.rel_path, e);
continue;
}
};
if let Err(e) = dao.store_detection(
// Stage 1: persist detections, holding the dao lock only
// across synchronous DB writes.
let mut stored_for_autobind: Vec<(i32, Vec<f32>)> = Vec::new();
{
let mut dao = face_dao.lock().expect("face dao");
if resp.faces.is_empty() {
if let Err(e) = dao.mark_status(
&ctx,
InsertFaceDetectionInput {
library_id,
content_hash: cand.content_hash.clone(),
rel_path: cand.rel_path.clone(),
bbox: Some((face.bbox.x, face.bbox.y, face.bbox.w, face.bbox.h)),
embedding: Some(emb),
confidence: Some(face.confidence),
source: "auto".to_string(),
person_id: None,
status: "detected".to_string(),
model_version: resp.model_version.clone(),
},
library_id,
&cand.content_hash,
&cand.rel_path,
"no_faces",
&resp.model_version,
) {
warn!(
"face_watch: store_detection failed for {}: {:?}",
"face_watch: mark no_faces failed for {}: {:?}",
cand.rel_path, e
);
}
debug!(
"face_watch: {} → no faces (model {})",
cand.rel_path, resp.model_version
);
} else {
let face_count = resp.faces.len();
for face in &resp.faces {
let emb = match face.decode_embedding() {
Ok(b) => b,
Err(e) => {
warn!("face_watch: bad embedding for {}: {:?}", cand.rel_path, e);
continue;
}
};
// Decode the f32 vector once for auto-bind comparison.
let emb_floats = faces::decode_embedding_bytes(&emb);
match dao.store_detection(
&ctx,
InsertFaceDetectionInput {
library_id,
content_hash: cand.content_hash.clone(),
rel_path: cand.rel_path.clone(),
bbox: Some((face.bbox.x, face.bbox.y, face.bbox.w, face.bbox.h)),
embedding: Some(emb),
confidence: Some(face.confidence),
source: "auto".to_string(),
person_id: None,
status: "detected".to_string(),
model_version: resp.model_version.clone(),
},
) {
Ok(row) => {
if let Some(floats) = emb_floats {
stored_for_autobind.push((row.id, floats));
}
}
Err(e) => warn!(
"face_watch: store_detection failed for {}: {:?}",
cand.rel_path, e
),
}
}
info!(
"face_watch: {} → {} face(s) ({}ms, {})",
cand.rel_path, face_count, resp.duration_ms, resp.model_version
);
}
info!(
"face_watch: {} → {} face(s) ({}ms, {})",
cand.rel_path, face_count, resp.duration_ms, resp.model_version
}
// Stage 2: auto-bind newly-stored faces against same-named
// people-tags. Done outside the dao lock so the lookups don't
// serialize with concurrent detect tasks.
if !stored_for_autobind.is_empty() {
try_auto_bind(
&ctx,
&cand.rel_path,
&resp.model_version,
stored_for_autobind,
&tag_dao,
&face_dao,
);
}
}
@@ -243,6 +280,137 @@ async fn process_one(
}
}
/// Auto-bind newly-detected faces to a same-named person, when a tag on the
/// photo unambiguously identifies one. Driven by `FACE_AUTOBIND_MIN_COS`
/// (default 0.4): the new face's embedding must reach this cosine
/// similarity against the L2-normalized mean of the person's existing
/// faces. The first face for a person binds unconditionally — there's
/// nothing to compare against, and the alternative ("never bind without
/// a reference") would mean bootstrap never kicks off.
///
/// Multi-match (the photo carries tags for two different known persons)
/// is intentionally a no-op — we can't tell which face is which without
/// additional matching. Those faces stay unassigned for the cluster
/// suggester (Phase 6) to handle.
fn try_auto_bind(
ctx: &opentelemetry::Context,
rel_path: &str,
model_version: &str,
new_faces: Vec<(i32, Vec<f32>)>, // (face_id, decoded embedding)
tag_dao: &Arc<Mutex<Box<dyn TagDao>>>,
face_dao: &Arc<Mutex<Box<dyn FaceDao>>>,
) {
// 1. Pull the photo's tags.
let tag_names: Vec<String> = {
let mut td = tag_dao.lock().expect("tag dao");
match td.get_tags_for_path(ctx, rel_path) {
Ok(tags) => tags.into_iter().map(|t| t.name).collect(),
Err(e) => {
warn!(
"face_watch: get_tags_for_path failed for {}: {:?}",
rel_path, e
);
return;
}
}
};
if tag_names.is_empty() {
return;
}
// 2. Find tags that map to existing persons (case-insensitive).
let person_for_tag: std::collections::HashMap<String, i32> = {
let mut fd = face_dao.lock().expect("face dao");
match fd.find_persons_by_names_ci(ctx, &tag_names) {
Ok(m) => m,
Err(e) => {
warn!(
"face_watch: find_persons_by_names_ci failed for {}: {:?}",
rel_path, e
);
return;
}
}
};
// 3. Multi-match: ambiguous, skip. Single match: candidate person.
let unique_person_ids: std::collections::HashSet<i32> =
person_for_tag.values().copied().collect();
if unique_person_ids.len() != 1 {
if !unique_person_ids.is_empty() {
debug!(
"face_watch: {} carries tags for {} different persons; skipping auto-bind",
rel_path,
unique_person_ids.len()
);
}
return;
}
let person_id = *unique_person_ids.iter().next().expect("nonempty set");
let threshold: f32 = std::env::var("FACE_AUTOBIND_MIN_COS")
.ok()
.and_then(|s| s.parse().ok())
.filter(|t: &f32| *t >= 0.0 && *t <= 1.0)
.unwrap_or(0.4);
// 4. Reference embedding (if any) under the same model_version.
let reference: Option<Vec<f32>> = {
let mut fd = face_dao.lock().expect("face dao");
match fd.person_reference_embedding(ctx, person_id, model_version) {
Ok(r) => r,
Err(e) => {
warn!(
"face_watch: person_reference_embedding failed for person {}: {:?}",
person_id, e
);
return;
}
}
};
// 5. Bind each new face that meets the criterion. Hold the lock once
// for the whole batch; assign_face_to_person uses its own short
// transaction internally.
let mut fd = face_dao.lock().expect("face dao");
for (face_id, emb) in new_faces {
let bind = match &reference {
None => {
// Person has no faces yet: the first one wins, since otherwise
// bootstrap could never produce a usable reference. After this
// row commits, future faces evaluate against it.
debug!(
"face_watch: auto-binding first face {} → person {} (no reference yet)",
face_id, person_id
);
true
}
Some(ref_vec) => {
let sim = faces::cosine_similarity(&emb, ref_vec);
if sim >= threshold {
debug!(
"face_watch: auto-binding face {} → person {} (cos={:.3} ≥ {:.3})",
face_id, person_id, sim, threshold
);
true
} else {
debug!(
"face_watch: leaving face {} unassigned (cos={:.3} < {:.3} for person {})",
face_id, sim, threshold, person_id
);
false
}
}
};
if bind && let Err(e) = fd.assign_face_to_person(ctx, face_id, person_id) {
warn!(
"face_watch: assign_face_to_person failed (face={}, person={}): {:?}",
face_id, person_id, e
);
}
}
}
/// Drop candidates whose path matches the watcher's `EXCLUDED_DIRS` rules.
/// Pulled out for unit testing — the same `PathExcluder` /memories uses,
/// just applied at the face-detect candidate set instead of the memories