Make the embedding model swappable via env for A/B testing

Trialing Qwen3-Embedding-0.6B (1024-dim, instruct-prefixed queries) against nomic required code changes at every hardcoded seam; now it's a config flip plus a reembed_embeddings run.

- EMBEDDING_DIM env (default 768) replaces every hardcoded dim check: daily summary / calendar / search / location DAOs, Ollama batch validation, reembed_embeddings
- entities gains the dim guard it never had: a wrong-dim vector silently kills dedup/recall (cosine over mismatched lengths is 0), so store None and warn instead
- embed_query / embed_document split with EMBED_QUERY_PREFIX / EMBED_DOCUMENT_PREFIX (literal \n expanded): retrieval models treat the two sides differently. nomic wants search_query:/search_document:, Qwen3 wants Instruct:...\nQuery: on queries only. All query-side call sites and all corpus writers now declare their side.
- document the contract in CLAUDE.md: change the model or any of these vars → re-run reembed_embeddings or search is garbage

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
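
The dim handling described in the first two bullets can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the helper names `embedding_dim_from`, `embedding_dim`, and `guard_dim` are hypothetical, and the fallback to 768 on an unparsable or zero value is an assumption (the message only states the default).

```rust
/// Default dimension stated in the commit message.
const DEFAULT_EMBEDDING_DIM: usize = 768;

/// Parse a raw `EMBEDDING_DIM` value. Hypothetical helper: falls back to the
/// default when the value is missing, unparsable, or zero (an assumption).
fn embedding_dim_from(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&d| d > 0)
        .unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Read the expected dimension from the process environment.
#[allow(dead_code)]
fn embedding_dim() -> usize {
    embedding_dim_from(std::env::var("EMBEDDING_DIM").ok().as_deref())
}

/// Keep a vector only if its length matches the expected dimension.
/// On mismatch, warn and return `None` so the caller stores no embedding
/// rather than one that can never match anything.
fn guard_dim(vec: Vec<f32>, expected: usize) -> Option<Vec<f32>> {
    if vec.len() == expected {
        Some(vec)
    } else {
        eprintln!(
            "embedding dim mismatch: got {}, expected {}; storing None",
            vec.len(),
            expected
        );
        None
    }
}
```

With `EMBEDDING_DIM` unset, a 1024-dim vector from a Qwen3 trial would be rejected by `guard_dim(v, embedding_dim())`; setting `EMBEDDING_DIM=1024` makes the same vector pass.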
+8 -6
@@ -43,14 +43,16 @@ impl LocalLlm {
         )
     }
 
-    /// Embed one string via the `LLM_BACKEND`-selected client.
-    pub async fn embed(&self, text: &str) -> Result<Vec<f32>> {
-        super::embed_one(&self.ollama, self.llamacpp.as_deref(), text).await
+    /// Embed a search query (applies `EMBED_QUERY_PREFIX`). Callers must
+    /// pick query vs document — retrieval models treat the two sides
+    /// differently and an unmarked embed invites prefix-mismatch bugs.
+    pub async fn embed_query(&self, text: &str) -> Result<Vec<f32>> {
+        super::embed_query(&self.ollama, self.llamacpp.as_deref(), text).await
     }
 
-    /// Embed a batch via the `LLM_BACKEND`-selected client.
-    pub async fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>> {
-        super::embed_many(&self.ollama, self.llamacpp.as_deref(), texts).await
+    /// Embed corpus text (applies `EMBED_DOCUMENT_PREFIX`).
+    pub async fn embed_document(&self, text: &str) -> Result<Vec<f32>> {
+        super::embed_document(&self.ollama, self.llamacpp.as_deref(), text).await
     }
 
     /// Single-shot local text generation via the `LLM_BACKEND`-selected
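
The query/document split in this hunk relies on per-side prefixes whose literal `\n` is expanded. A rough sketch of that idea follows, assuming nothing about the real `super::embed_query` / `super::embed_document` internals: the `Side` enum, the `Prefixes` struct, and `expand_prefix` are hypothetical names, the trailing space after the nomic prefixes is an assumption, and `<task>` is a placeholder for the instruction text the commit message elides.

```rust
/// Turn the two-character sequence `\n` (backslash, n) in an env value
/// into a real newline. Hypothetical helper.
fn expand_prefix(raw: &str) -> String {
    raw.replace("\\n", "\n")
}

/// Which side of retrieval a piece of text belongs to.
#[derive(Clone, Copy)]
enum Side {
    Query,
    Document,
}

/// Expanded prefixes, e.g. built from the values of `EMBED_QUERY_PREFIX`
/// and `EMBED_DOCUMENT_PREFIX`.
struct Prefixes {
    query: String,
    document: String,
}

impl Prefixes {
    /// Build from raw (unexpanded) env-style values.
    fn from_raw(query: &str, document: &str) -> Self {
        Prefixes {
            query: expand_prefix(query),
            document: expand_prefix(document),
        }
    }

    /// Prepend the prefix for the declared side. An empty prefix leaves
    /// the text unchanged.
    fn apply(&self, side: Side, text: &str) -> String {
        let prefix = match side {
            Side::Query => &self.query,
            Side::Document => &self.document,
        };
        format!("{}{}", prefix, text)
    }
}
```

A nomic-style setup would be `Prefixes::from_raw("search_query: ", "search_document: ")`, while a Qwen3-style setup would prefix only queries, e.g. `Prefixes::from_raw("Instruct: <task>\\nQuery: ", "")`. Forcing every caller to pass a `Side` mirrors the commit's intent that no embed call is left unmarked.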