Make the embedding model swappable via env for A/B testing

Trialing Qwen3-Embedding-0.6B (1024-dim, instruct-prefixed queries)
against nomic required code changes at every hardcoded seam; now it's a
config flip plus a reembed_embeddings run.

- EMBEDDING_DIM env (default 768) replaces every hardcoded dim check:
  daily summary / calendar / search / location DAOs, Ollama batch
  validation, reembed_embeddings
- entities gains the dim guard it never had: a wrong-dim vector silently
  kills dedup/recall (cosine over mismatched lengths is 0), so store
  None and warn instead
- embed_query / embed_document split with EMBED_QUERY_PREFIX /
  EMBED_DOCUMENT_PREFIX (literal \n expanded): retrieval models treat
  the two sides differently. nomic wants
  search_query:/search_document:, Qwen3 wants Instruct:...\nQuery: on
  queries only. All query-side call sites and all corpus writers now
  declare their side.
- document the contract in CLAUDE.md: change the model or any of these
  vars → re-run reembed_embeddings or search is garbage

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
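
The query/document prefix split described in the message can be sketched roughly as follows. This is an illustration only, not the code from this commit: the `Side` enum and the `expand_prefix`, `apply_prefix`, and `prefixed_from_env` helpers are hypothetical names. Only the `EMBED_QUERY_PREFIX` / `EMBED_DOCUMENT_PREFIX` variables and the "literal \n expanded" behaviour come from the message. Treating an unset prefix as empty is an assumption.

```rust
use std::env;

/// Which side of retrieval a text belongs to (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Side {
    Query,
    Document,
}

/// Turn the two-character sequence `\n` in a configured prefix into a real
/// newline, so a one-line env value can carry a multi-line instruction.
pub fn expand_prefix(raw: &str) -> String {
    raw.replace("\\n", "\n")
}

/// Prepend the prefix configured for `side`. An unset prefix is treated as
/// empty (assumption), which leaves the text unchanged.
pub fn apply_prefix(
    side: Side,
    query_prefix: Option<&str>,
    document_prefix: Option<&str>,
    text: &str,
) -> String {
    let raw = match side {
        Side::Query => query_prefix,
        Side::Document => document_prefix,
    };
    format!("{}{}", expand_prefix(raw.unwrap_or("")), text)
}

/// Read both prefixes from the environment and apply the one for `side`.
pub fn prefixed_from_env(side: Side, text: &str) -> String {
    let query = env::var("EMBED_QUERY_PREFIX").ok();
    let document = env::var("EMBED_DOCUMENT_PREFIX").ok();
    apply_prefix(side, query.as_deref(), document.as_deref(), text)
}
```

With a nomic-style setup both prefixes would be set, while a Qwen3-style setup would set only the query prefix, so documents pass through untouched. The prefix strings used in any such configuration are illustrative and should be taken from the model's own documentation.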
@@ -141,7 +141,7 @@ async fn embed_with_truncation(llm: &LocalLlm, text: &str) -> Result<(Vec<f32>,
     let mut text = text.to_string();
     let mut truncated = false;
     loop {
-        match llm.embed(&text).await {
+        match llm.embed_document(&text).await {
             Ok(emb) => return Ok((emb, truncated)),
             Err(e)
                 if e.to_string().contains("too large to process") && text.chars().count() > 64 =>
@@ -194,14 +194,15 @@ async fn reembed_table(
             }
         };

-        // The whole pipeline (DAO checks, stored corpora) assumes 768 dims.
-        // A different dim means the active backend is not serving a
-        // nomic-compatible model — stop rather than corrupt the table.
+        // The whole pipeline (DAO checks, stored corpora) assumes
+        // EMBEDDING_DIM dims. A mismatch means the active embed slot is not
+        // serving the configured model — stop rather than corrupt the table.
         anyhow::ensure!(
-            new_emb.len() == 768,
-            "backend returned {}-dim embedding (expected 768) — '{}' is not \
-             serving a nomic-embed-text-v1.5-compatible model",
+            new_emb.len() == image_api::ai::embedding_dim(),
+            "backend returned {}-dim embedding (expected {}) — '{}' does not \
+             match the configured EMBEDDING_DIM",
             new_emb.len(),
+            image_api::ai::embedding_dim(),
             llm.embedding_model_version()
         );

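
The diff calls `image_api::ai::embedding_dim()` but does not show its body. A minimal sketch of how such a helper and the "store None and warn" guard from the commit message might look is given below. `parse_dim`, `guard_dim`, and `DEFAULT_EMBEDDING_DIM` are hypothetical names, and falling back to the default on an unparsable or zero value is an assumption rather than confirmed behaviour.

```rust
use std::env;

/// Default from the commit message: 768 dims unless EMBEDDING_DIM says otherwise.
pub const DEFAULT_EMBEDDING_DIM: usize = 768;

/// Parse a raw EMBEDDING_DIM value. Missing, unparsable, or zero values fall
/// back to the default (assumption: the real helper may instead fail loudly).
pub fn parse_dim(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&d| d > 0)
        .unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Env-backed accessor, shaped like the `embedding_dim()` call in the diff.
pub fn embedding_dim() -> usize {
    parse_dim(env::var("EMBEDDING_DIM").ok().as_deref())
}

/// Keep a vector only if it has the expected length; otherwise warn and
/// return None so a wrong-dim vector is never stored.
pub fn guard_dim(emb: Vec<f32>, expected: usize) -> Option<Vec<f32>> {
    if emb.len() == expected {
        Some(emb)
    } else {
        eprintln!(
            "warning: dropping {}-dim embedding (expected {})",
            emb.len(),
            expected
        );
        None
    }
}
```

The two failure modes are deliberately different: the re-embed job in the hunk above aborts via `anyhow::ensure!` rather than corrupt a table, while a write path such as entities can degrade to `None` and keep running.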