faces: phase 2 — schema + manual face/person CRUD

Land the persistence model and HTTP surface for local face recognition.
Inference still lives in Apollo (Phase 1); this side adds the data home
plus every endpoint Apollo's UI and FileViewer-React will consume.

Schema (new migration 2026-04-29-000000_add_faces):
  - persons: visual identities. Optional entity_id bridges to the
    existing knowledge-graph entities table; auto-bridging is left to
    the management UI (we don't muddy LLM provenance from face rows).
    UNIQUE(name COLLATE NOCASE) so 'alice' / 'Alice' fold to one row.
  - face_detections: keyed on content_hash (cross-library dedup), with
    status='detected' carrying bbox + 512-d embedding BLOB, and
    'no_faces' / 'failed' marker rows that tell Phase 3's file watcher
    not to re-scan. Marker invariant enforced via CHECK; partial UNIQUE
    on content_hash WHERE status='no_faces' guards against double-marks.
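
The marker invariant can be pictured as a predicate over a row's shape.
This is only a sketch: the real rule lives in the migration's CHECK
constraint, and `Status`, `RowShape`, `EMBEDDING_BYTES`, and
`satisfies_marker_invariant` are hypothetical names. Whether the CHECK
also pins the BLOB length is an assumption made here for illustration.

```rust
/// Hypothetical status values mirroring the `status` column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Detected,
    NoFaces,
    Failed,
}

/// Hypothetical in-memory view of a `face_detections` row. Field names
/// are assumptions, not the migration's actual columns.
pub struct RowShape {
    pub status: Status,
    pub has_bbox: bool,
    /// Length of the embedding BLOB in bytes, if one is stored.
    pub embedding_len: Option<usize>,
}

/// 512 f32 values at 4 bytes each.
pub const EMBEDDING_BYTES: usize = 512 * 4;

/// Detected rows carry a bbox and a full embedding; marker rows carry
/// neither. (Assumed reading of the invariant, not the literal CHECK.)
pub fn satisfies_marker_invariant(row: &RowShape) -> bool {
    match row.status {
        Status::Detected => row.has_bbox && row.embedding_len == Some(EMBEDDING_BYTES),
        Status::NoFaces | Status::Failed => !row.has_bbox && row.embedding_len.is_none(),
    }
}
```

Under this reading, a 'no_faces' row that still carries a bbox is
rejected, which is the kind of half-written marker the CHECK is meant
to keep out of the table.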

Schema regenerated with `diesel print-schema` against a clean migration
run; joinables added for face_detections → libraries / persons and
persons → entities.

face_client.rs (sibling of apollo_client.rs):
  - reqwest multipart, 60 s timeout (CPU inference on a backlog can be
    slow; bounded threadpool on Apollo serializes calls anyway).
  - FaceDetectError::{Permanent, Transient, Disabled} — Phase 3 keys
    its marker-row decision on this. 422 → mark failed, 5xx → defer.
  - APOLLO_FACE_API_BASE_URL falls back to APOLLO_API_BASE_URL when
    unset; both unset = is_enabled() false, callers no-op.
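
The two rules above (status code to error class, and base-URL fallback)
can be sketched as pure functions. This is an illustration, not the
client's code: `Outcome`, `classify_status`, and `resolve_base_url` are
hypothetical names, and the sketch assumes only `None` counts as unset
(an empty string is passed through).

```rust
/// Hypothetical stand-in for the three-way split the client reports.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Permanent,
    Transient,
}

/// Classify an HTTP status the way the commit message describes:
/// 422 is permanent, 503 is transient, any other 4xx is treated as
/// permanent, and everything else that is not 2xx is deferred.
pub fn classify_status(status: u16) -> Outcome {
    match status {
        200..=299 => Outcome::Success,
        422 => Outcome::Permanent,
        503 => Outcome::Transient,
        400..=499 => Outcome::Permanent,
        _ => Outcome::Transient,
    }
}

/// Prefer the dedicated face URL, fall back to the general Apollo URL,
/// and trim trailing slashes so later path joins do not double up.
/// Returns `None` when both are unset, i.e. the feature is disabled.
pub fn resolve_base_url(face_url: Option<&str>, apollo_url: Option<&str>) -> Option<String> {
    face_url
        .or(apollo_url)
        .map(|u| u.trim_end_matches('/').to_string())
}
```

A caller would feed `std::env::var("APOLLO_FACE_API_BASE_URL").ok()` and
`std::env::var("APOLLO_API_BASE_URL").ok()` into `resolve_base_url`;
keeping the function free of env access makes it easy to unit-test.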

faces.rs (DAO + handlers):
  - SqliteFaceDao implements the full FaceDao trait; person face counts
    go through sql_query because diesel's BoxedSelectStatement +
    group_by trips trait-resolver recursion.
  - merge_persons re-points face rows in a transaction, copies notes
    when target's are empty, deletes src.
  - manual POST /image/faces resolves content_hash through image_exif,
    crops the user-drawn bbox with 10% padding (detector wants context
    around ears/jaw), POSTs the crop to face_client.embed for a real
    ArcFace vector, then inserts source='manual'.
  - Cluster-suggest (Phase 6) gets its data from
    GET /faces/embeddings — base64-encoded paged BLOBs so Apollo's
    DBSCAN can stream them without ImageApi pre-aggregating.
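
The 10% padding step in the manual-create flow can be sketched as
below. Several things here are assumptions rather than facts from the
handler: that the bbox arrives in normalized 0..1 coordinates, that the
padding is a fraction of the box's own width and height on each side,
and the names `PixelRect` and `pad_and_clamp`.

```rust
/// Hypothetical pixel-space crop rectangle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PixelRect {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Grow a normalized bbox by `pad_frac` of its size on every side, then
/// clamp to the image bounds. Returns `None` for empty boxes or images.
pub fn pad_and_clamp(
    x: f32,
    y: f32,
    w: f32,
    h: f32,
    img_w: u32,
    img_h: u32,
    pad_frac: f32,
) -> Option<PixelRect> {
    if img_w == 0 || img_h == 0 || w <= 0.0 || h <= 0.0 {
        return None;
    }
    let (iw, ih) = (img_w as f32, img_h as f32);
    // Normalized coordinates to pixels.
    let (px, py, pw, ph) = (x * iw, y * ih, w * iw, h * ih);
    let (dx, dy) = (pw * pad_frac, ph * pad_frac);
    // Expand outward, never past the image edges.
    let left = (px - dx).max(0.0).floor();
    let top = (py - dy).max(0.0).floor();
    let right = (px + pw + dx).min(iw).ceil();
    let bottom = (py + ph + dy).min(ih).ceil();
    if right <= left || bottom <= top {
        return None;
    }
    Some(PixelRect {
        x: left as u32,
        y: top as u32,
        w: (right - left) as u32,
        h: (bottom - top) as u32,
    })
}
```

For a box covering the central half of a 1000x800 image, a 0.10 pad
yields a 600x480 crop starting at (200, 160); a box touching the
top-left corner only grows toward the interior.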

Endpoints registered alongside add_*_services in main.rs:
  GET    /faces/stats?library=
  GET    /faces/embeddings?library=&unassigned=&limit=&offset=
  GET    /image/faces?path=&library=
  POST   /image/faces                        (manual create via embed)
  PATCH  /image/faces/{id}
  DELETE /image/faces/{id}
  GET    /persons?library=
  POST   /persons
  GET    /persons/{id}
  PATCH  /persons/{id}
  DELETE /persons/{id}?cascade=set_null|delete   (set_null default)
  POST   /persons/{id}/merge
  GET    /persons/{id}/faces?library=

The file-watch hook (Phase 3) and the rerun-on-one-photo handler
(Phase 6) live behind the FaceDao methods marked dead_code today —
they're called only when those phases land. Same shape for the trait
methods that aren't reached by Phase 2 routes.

Tests: 3 DAO unit tests cover person CRUD + case-insensitive uniqueness,
marker-row idempotency (mark_status is a no-op when any row exists),
and merge re-pointing faces.

Cargo.toml: reqwest gains the `multipart` feature.

cargo build / cargo test --lib / cargo fmt / cargo clippy --all-targets
all clean for the new code; the two pre-existing test_path_excluder
failures and the pre-existing sort_by clippy warnings are unrelated and
present on master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-04-29 18:03:42 +00:00
parent 6642db3c8b
commit 860169032b
13 changed files with 2344 additions and 1 deletions

src/ai/face_client.rs

@@ -0,0 +1,312 @@
//! Thin async HTTP client for Apollo's `/api/internal/faces/*` endpoints.
//!
//! Apollo (the personal location-history viewer at the sibling repo) hosts the
//! insightface inference service. This client is the ImageApi side of the
//! contract — it shoves image bytes through `/detect` and returns boxes +
//! 512-d ArcFace embeddings, plus a single-embedding `/embed` for the manual
//! face-create flow.
//!
//! Mirrors `apollo_client.rs` shape: optional base URL (None = disabled, the
//! file watcher and manual-create handlers no-op), reqwest client with a
//! generous timeout because CPU inference on a backlog can take many seconds
//! per photo.
//!
//! Configured via `APOLLO_FACE_API_BASE_URL`, falling back to
//! `APOLLO_API_BASE_URL` when the dedicated var is unset (single-Apollo
//! deploys are the common case). Both unset → `is_enabled()` returns false.
//!
//! Wire format: multipart/form-data with `file=<bytes>` and `meta=<json>`.
//! `meta` carries `{content_hash, library_id, rel_path, orientation?,
//! model_version?}`. It is intended for Apollo-side logging and idempotency;
//! Apollo ignores it today, but it is part of the stable wire contract so
//! future versions can act on it without a client change.
//!
//! Error mapping (reflected in [`FaceDetectError`]):
//! - 422 `decode_failed` → permanent: ImageApi marks `status='failed'` and
//! doesn't retry until manual rerun.
//! - 200 with `faces:[]` → `status='no_faces'` marker row.
//! - 503 `cuda_oom` / `engine_unavailable` → defer-and-retry: no marker
//! written.
//! - Any other 5xx / network error → defer.
use anyhow::{Context, Result};
use base64::Engine;
use reqwest::Client;
use serde::{Deserialize, Serialize};
use std::time::Duration;
#[derive(Debug, Clone, Serialize)]
pub struct DetectMeta {
    pub content_hash: String,
    pub library_id: i32,
    pub rel_path: String,
    /// EXIF orientation int (1..8). Apollo applies `exif_transpose` on the
    /// bytes before inference, so this is informational only; supply when
    /// the bytes were extracted from a RAW preview that lost the tag.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub orientation: Option<i32>,
    /// Echoed back in the response. ImageApi stores it in
    /// `face_detections.model_version`.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub model_version: Option<String>,
}
// Wire shape for the bbox sub-object Apollo returns. Read by Phase 3's
// file-watch hook; silence the dead-code lint until then.
#[allow(dead_code)]
#[derive(Debug, Clone, Deserialize)]
pub struct DetectedBbox {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}
#[allow(dead_code)] // bbox consumed by Phase 3 file-watch hook
#[derive(Debug, Clone, Deserialize)]
pub struct DetectedFace {
    pub bbox: DetectedBbox,
    pub confidence: f32,
    /// base64 of 2048 bytes (512×f32 LE). ImageApi stores the raw bytes
    /// verbatim as a BLOB; see `decode_embedding` for the base64 decode
    /// and length check.
    pub embedding: String,
}
impl DetectedFace {
    /// Decode the wire-format embedding back into raw bytes for storage.
    /// Returns the 2048-byte little-endian f32 buffer or an error if the
    /// base64 is malformed or the wrong length.
    pub fn decode_embedding(&self) -> Result<Vec<u8>> {
        let bytes = base64::engine::general_purpose::STANDARD
            .decode(self.embedding.as_bytes())
            .context("face embedding base64 decode")?;
        if bytes.len() != 2048 {
            anyhow::bail!(
                "face embedding wrong size: got {} bytes, expected 2048",
                bytes.len()
            );
        }
        Ok(bytes)
    }
}
#[allow(dead_code)] // duration_ms logged by Phase 3 file-watch hook
#[derive(Debug, Clone, Deserialize)]
pub struct DetectResponse {
    pub model_version: String,
    pub duration_ms: i64,
    pub faces: Vec<DetectedFace>,
}
#[derive(Debug, Clone, Deserialize)]
#[allow(dead_code)] // Reported by Apollo; useful for future health-driven backoff
pub struct FaceHealth {
    pub loaded: bool,
    pub providers: Vec<String>,
    pub model_version: String,
    pub det_size: i32,
    #[serde(default)]
    pub load_error: Option<String>,
}
/// Distinguishes permanent failures (don't retry) from transient ones
/// (defer and retry on next scan tick). The file-watch hook keys its
/// marker-row decision on this: a `Permanent` outcome writes
/// `status='failed'`, a `Transient` outcome writes nothing so the next
/// pass tries again.
#[derive(Debug)]
pub enum FaceDetectError {
    /// Apollo refused the bytes for a reason that won't change on retry
    /// (decode failure, zero-dim image). Mark `status='failed'`.
    Permanent(anyhow::Error),
    /// Apollo couldn't process this turn but might next time (CUDA OOM,
    /// engine not loaded yet, network hiccup). Don't mark anything.
    Transient(anyhow::Error),
    /// Feature is disabled (neither `APOLLO_FACE_API_BASE_URL` nor the
    /// `APOLLO_API_BASE_URL` fallback is set). Caller should silently
    /// no-op, same shape as `apollo_client::is_enabled()` false.
    Disabled,
}
impl std::fmt::Display for FaceDetectError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            FaceDetectError::Permanent(e) => write!(f, "permanent: {e}"),
            FaceDetectError::Transient(e) => write!(f, "transient: {e}"),
            FaceDetectError::Disabled => write!(f, "face client disabled"),
        }
    }
}
impl std::error::Error for FaceDetectError {}
#[derive(Clone)]
pub struct FaceClient {
    client: Client,
    /// `None` → disabled. Trim trailing slash at construction so url
    /// building doesn't double up.
    base_url: Option<String>,
}
impl FaceClient {
    pub fn new(base_url: Option<String>) -> Self {
        // Default timeout is 60 s, overridable via `FACE_DETECT_TIMEOUT_SEC`.
        // CPU inference on a backlog can take many seconds per photo, and
        // the first call into a cold GPU is slower still. Apollo's bounded
        // threadpool (1 worker on CUDA) means concurrent calls queue
        // server-side; 60 s is enough headroom for a few items in the
        // queue without surfacing a false transient.
        let timeout_secs = std::env::var("FACE_DETECT_TIMEOUT_SEC")
            .ok()
            .and_then(|s| s.parse::<u64>().ok())
            .unwrap_or(60);
        let client = Client::builder()
            .timeout(Duration::from_secs(timeout_secs))
            .build()
            .expect("reqwest client build");
        Self {
            client,
            base_url: base_url.map(|u| u.trim_end_matches('/').to_string()),
        }
    }
    pub fn is_enabled(&self) -> bool {
        self.base_url.is_some()
    }
    /// Detect every face in `bytes`. ImageApi calls this from the file-watch
    /// hook (Phase 3) and from the manual rerun handler. Empty `faces[]` in
    /// the response is the no-faces signal; the caller writes a marker row.
    #[allow(dead_code)] // Phase 3 file-watch hook + rerun handler
    pub async fn detect(
        &self,
        bytes: Vec<u8>,
        meta: DetectMeta,
    ) -> std::result::Result<DetectResponse, FaceDetectError> {
        let Some(base) = self.base_url.as_deref() else {
            return Err(FaceDetectError::Disabled);
        };
        let url = format!("{}/api/internal/faces/detect", base);
        self.post_multipart(&url, bytes, &meta).await
    }
    /// Single-embedding endpoint for the manual face-create flow. Caller
    /// crops the image to the user-drawn bbox and passes those bytes;
    /// Apollo runs detection inside the crop and returns the
    /// highest-confidence face's embedding. Apollo returns 422
    /// `no_face_in_crop` when the box missed, surfaced here as `Permanent`.
    pub async fn embed(
        &self,
        bytes: Vec<u8>,
        meta: DetectMeta,
    ) -> std::result::Result<DetectResponse, FaceDetectError> {
        let Some(base) = self.base_url.as_deref() else {
            return Err(FaceDetectError::Disabled);
        };
        let url = format!("{}/api/internal/faces/embed", base);
        self.post_multipart(&url, bytes, &meta).await
    }
    /// Engine reachability + provider/model report. Used by ImageApi for a
    /// startup sanity check; not on the hot path.
    #[allow(dead_code)] // Phase 3 startup probe
    pub async fn health(&self) -> Result<FaceHealth> {
        let base = self.base_url.as_deref().context("face client disabled")?;
        let url = format!("{}/api/internal/faces/health", base);
        let resp = self.client.get(&url).send().await?.error_for_status()?;
        let body: FaceHealth = resp.json().await?;
        Ok(body)
    }
    async fn post_multipart(
        &self,
        url: &str,
        bytes: Vec<u8>,
        meta: &DetectMeta,
    ) -> std::result::Result<DetectResponse, FaceDetectError> {
        let meta_json = serde_json::to_string(meta)
            .map_err(|e| FaceDetectError::Permanent(anyhow::anyhow!("meta serialize: {e}")))?;
        // The MIME literal is static and should always parse. If it ever
        // fails, return an error instead of silently uploading an empty part.
        let file_part = reqwest::multipart::Part::bytes(bytes)
            .file_name(meta.rel_path.clone())
            .mime_str("application/octet-stream")
            .map_err(|e| FaceDetectError::Permanent(anyhow::anyhow!("file part mime: {e}")))?;
        let form = reqwest::multipart::Form::new()
            .text("meta", meta_json)
            .part("file", file_part);
        let resp = match self.client.post(url).multipart(form).send().await {
            Ok(r) => r,
            Err(e) if e.is_timeout() || e.is_connect() => {
                return Err(FaceDetectError::Transient(anyhow::anyhow!(
                    "face client network: {e}"
                )));
            }
            Err(e) => {
                return Err(FaceDetectError::Transient(anyhow::anyhow!(
                    "face client request: {e}"
                )));
            }
        };
        let status = resp.status();
        if status.is_success() {
            let body: DetectResponse = resp.json().await.map_err(|e| {
                FaceDetectError::Transient(anyhow::anyhow!("face response decode: {e}"))
            })?;
            return Ok(body);
        }
        let body_text = resp.text().await.unwrap_or_default();
        // Apollo encodes its error class in the JSON body's `detail`. Try
        // to parse it; fall back to status-only classification.
        let detail_code = serde_json::from_str::<serde_json::Value>(&body_text)
            .ok()
            .and_then(|v| {
                // detail can be a string ("decode_failed") or an object
                // ({"code": "cuda_oom", ...}) depending on the endpoint
                // and Apollo's response shape; handle both.
                v.get("detail")
                    .and_then(|d| d.as_str().map(str::to_string))
                    .or_else(|| {
                        v.get("detail")
                            .and_then(|d| d.get("code"))
                            .and_then(|c| c.as_str())
                            .map(str::to_string)
                    })
            })
            .unwrap_or_default();
        if status == reqwest::StatusCode::UNPROCESSABLE_ENTITY {
            return Err(FaceDetectError::Permanent(anyhow::anyhow!(
                "face detect 422 {}: {}",
                detail_code,
                body_text
            )));
        }
        if status == reqwest::StatusCode::SERVICE_UNAVAILABLE {
            return Err(FaceDetectError::Transient(anyhow::anyhow!(
                "face detect 503 {}: {}",
                detail_code,
                body_text
            )));
        }
        // Any other 4xx: be conservative and treat as Permanent so we
        // don't loop forever on a stable rejection. Anything else
        // (including other 5xx): Transient, likely intermittent.
        if status.is_client_error() {
            Err(FaceDetectError::Permanent(anyhow::anyhow!(
                "face detect {} {}: {}",
                status.as_u16(),
                detail_code,
                body_text
            )))
        } else {
            Err(FaceDetectError::Transient(anyhow::anyhow!(
                "face detect {} {}: {}",
                status.as_u16(),
                detail_code,
                body_text
            )))
        }
    }
}
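
`decode_embedding` stops at the raw 2048-byte buffer, which is what gets
stored. A consumer that needs the actual vector (for similarity or
clustering) has to unpack it; the following is a minimal sketch of that
step using only the standard library. `EMBEDDING_DIM`,
`unpack_embedding`, and `pack_embedding` are hypothetical helpers, not
part of this file.

```rust
/// ArcFace embedding dimensionality, per the 512×f32 wire format.
pub const EMBEDDING_DIM: usize = 512;

/// Unpack a little-endian f32 buffer into a vector of floats.
/// Returns `None` when the buffer is not exactly 512 * 4 bytes.
pub fn unpack_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() != EMBEDDING_DIM * 4 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Inverse of `unpack_embedding`: serialize floats as little-endian bytes.
pub fn pack_embedding(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

Keeping the BLOB as opaque bytes in the database and unpacking only at
the point of use avoids any dependence on host endianness in storage.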