Make the embedding model swappable via env for A/B testing

Trialing Qwen3-Embedding-0.6B (1024-dim, instruct-prefixed queries) against nomic required code changes at every hardcoded seam; now it's a config flip plus a reembed_embeddings run.

- EMBEDDING_DIM env (default 768) replaces every hardcoded dim check: daily summary / calendar / search / location DAOs, Ollama batch validation, reembed_embeddings
- entities gains the dim guard it never had: a wrong-dim vector silently kills dedup/recall (cosine over mismatched lengths is 0), so store None and warn instead
- embed_query / embed_document split with EMBED_QUERY_PREFIX / EMBED_DOCUMENT_PREFIX (literal \n expanded): retrieval models treat the two sides differently; nomic wants search_query:/search_document:, Qwen3 wants Instruct:...\nQuery: on queries only. All query-side call sites and all corpus writers now declare their side.
- document the contract in CLAUDE.md: change the model or any of these vars → re-run reembed_embeddings or search is garbage

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 21:40:40 -04:00
parent b1493f5aca
commit efd05db523
12 changed files with 159 additions and 67 deletions
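
For readers skimming the diff, here is a minimal sketch of the three mechanisms the commit message describes: an env-driven dimension with a 768 default, literal `\n` expansion in the prefix variables, and a dimension guard that stores `None` and warns. It is not the code from this commit. The helper names (`parse_dim`, `expand_prefix`, `with_prefix`, `guard_dim`) are hypothetical, and the fallback for unset or unparsable values is an assumption; the real accessor appears in the diff as `image_api::ai::embedding_dim()`.

```rust
use std::env;

/// Default dimension when EMBEDDING_DIM is unset (768, per the commit message).
/// Falling back on unparsable or zero values is an assumption of this sketch.
const DEFAULT_EMBEDDING_DIM: usize = 768;

/// Hypothetical helper: parse a raw EMBEDDING_DIM value, else use the default.
fn parse_dim(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&d| d > 0)
        .unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Hypothetical helper: turn a literal backslash-n (two characters, as typed
/// in an env file) into a real newline.
fn expand_prefix(raw: &str) -> String {
    raw.replace("\\n", "\n")
}

/// Hypothetical helper: prepend the side-specific prefix (query or document)
/// to the text before it is sent to the embedding backend.
fn with_prefix(prefix: &str, text: &str) -> String {
    format!("{}{}", expand_prefix(prefix), text)
}

/// Hypothetical guard: keep the vector only if its length matches the
/// configured dimension; otherwise warn and return None.
fn guard_dim(emb: Vec<f32>, expected: usize) -> Option<Vec<f32>> {
    if emb.len() == expected {
        Some(emb)
    } else {
        eprintln!(
            "embedding has {} dims, expected {}; storing None",
            emb.len(),
            expected
        );
        None
    }
}

fn main() {
    let dim = parse_dim(env::var("EMBEDDING_DIM").ok().as_deref());
    let query_prefix = env::var("EMBED_QUERY_PREFIX").unwrap_or_default();
    let query = with_prefix(&query_prefix, "beach photos from last summer");
    println!("dim={dim} query={query:?}");

    // A 1024-dim vector is rejected unless EMBEDDING_DIM is set to 1024.
    let kept = guard_dim(vec![0.0; 1024], dim);
    println!("kept={}", kept.is_some());
}
```

Under these assumptions, trialing Qwen3 would mean setting `EMBEDDING_DIM=1024` and `EMBED_QUERY_PREFIX` to an `Instruct:...\nQuery: ` style string, leaving the document prefix empty, then re-running reembed_embeddings as the commit message requires.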
@@ -141,7 +141,7 @@ async fn embed_with_truncation(llm: &LocalLlm, text: &str) -> Result<(Vec<f32>,
    let mut text = text.to_string();
    let mut truncated = false;
    loop {
-        match llm.embed(&text).await {
+        match llm.embed_document(&text).await {
            Ok(emb) => return Ok((emb, truncated)),
            Err(e)
                if e.to_string().contains("too large to process") && text.chars().count() > 64 =>
@@ -194,14 +194,15 @@ async fn reembed_table(
            }
        };

-        // The whole pipeline (DAO checks, stored corpora) assumes 768 dims.
-        // A different dim means the active backend is not serving a
-        // nomic-compatible model — stop rather than corrupt the table.
+        // The whole pipeline (DAO checks, stored corpora) assumes
+        // EMBEDDING_DIM dims. A mismatch means the active embed slot is not
+        // serving the configured model — stop rather than corrupt the table.
        anyhow::ensure!(
-            new_emb.len() == 768,
-            "backend returned {}-dim embedding (expected 768) — '{}' is not \
-             serving a nomic-embed-text-v1.5-compatible model",
+            new_emb.len() == image_api::ai::embedding_dim(),
+            "backend returned {}-dim embedding (expected {}) — '{}' does not \
+             match the configured EMBEDDING_DIM",
            new_emb.len(),
+            image_api::ai::embedding_dim(),
            llm.embedding_model_version()
        );