Make the embedding model swappable via env for A/B testing

Trialing Qwen3-Embedding-0.6B (1024-dim, instruct-prefixed queries) against
nomic required code changes at every hardcoded seam; now it's a config flip
plus a reembed_embeddings run.

- EMBEDDING_DIM env (default 768) replaces every hardcoded dim check: daily
  summary / calendar / search / location DAOs, Ollama batch validation,
  reembed_embeddings
- entities gains the dim guard it never had: a wrong-dim vector silently
  kills dedup/recall (cosine over mismatched lengths is 0), so store None
  and warn instead
- embed_query / embed_document split with EMBED_QUERY_PREFIX /
  EMBED_DOCUMENT_PREFIX (literal \n expanded): retrieval models treat the
  two sides differently. nomic wants search_query:/search_document:, Qwen3
  wants Instruct:...\nQuery: on queries only. All query-side call sites and
  all corpus writers now declare their side.
- document the contract in CLAUDE.md: change the model or any of these
  vars → re-run reembed_embeddings or search is garbage

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
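Reviewer sketch of the dimension guard described in the message, assuming nothing about the actual code beyond what the message states. The function names (`parse_embedding_dim`, `embedding_dim`, `guard_embedding`, `cosine`) are hypothetical, and the cosine helper only illustrates the stated behaviour that mismatched lengths score 0; the real DAOs may be structured differently.

```rust
use std::env;

/// Default from the commit message: 768 (nomic-embed-text v1.5).
const DEFAULT_EMBEDDING_DIM: usize = 768;

/// Parse a raw EMBEDDING_DIM value; unset, unparsable, or zero falls back to the default.
fn parse_embedding_dim(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&d| d > 0)
        .unwrap_or(DEFAULT_EMBEDDING_DIM)
}

/// Read the expected dimension from the process environment.
fn embedding_dim() -> usize {
    parse_embedding_dim(env::var("EMBEDDING_DIM").ok().as_deref())
}

/// Hypothetical guard: keep the vector only if its length matches,
/// otherwise warn and return None so the caller stores no embedding.
fn guard_embedding(vector: Vec<f32>, expected_dim: usize, context: &str) -> Option<Vec<f32>> {
    if vector.len() == expected_dim {
        return Some(vector);
    }
    eprintln!(
        "warning: {}: embedding has {} dims, expected {}; storing None",
        context,
        vector.len(),
        expected_dim
    );
    None
}

/// Assumed cosine behaviour: mismatched lengths (or a zero vector) score 0.0,
/// which is why a wrong-dim vector silently drops out of dedup and recall.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

A writer would call something like `guard_embedding(vec, embedding_dim(), "entities")` before persisting, so a model/config mismatch shows up as a warning rather than as silently empty search results.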
@@ -645,6 +645,14 @@ OPENROUTER_APP_TITLE=ImageApi # Optional attribution header
# re-embedding — mixed vector spaces break similarity search.
LLM_BACKEND=ollama
# Embedding model contract. Corpus and queries must be embedded by the same
# model with matching prefixes — after changing the embed model or any of
# these, run `cargo run --bin reembed_embeddings` (all tables) or search is
# garbage. Prefix values may contain a literal \n (expanded to a newline).
EMBEDDING_DIM=768 # 768 = nomic-embed-text v1.5; 1024 = Qwen3-Embedding-0.6B
EMBED_QUERY_PREFIX= # nomic: "search_query: " | Qwen3: "Instruct: <task>\nQuery: "
EMBED_DOCUMENT_PREFIX= # nomic: "search_document: " | Qwen3: leave empty
# llama.cpp / llama-swap (used when LLM_BACKEND=llamacpp). OpenAI-compatible
# proxy hosting one or more llama-server processes. Chat models receive
# images directly via content-parts (all models assumed vision-capable).
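
Reviewer sketch of how the two prefix variables above might be applied, assuming the `\n` expansion is a plain string replace and that each call site passes its side explicitly. `EmbedSide`, `EmbedPrefixes`, `expand_newlines`, and the example task text "find photos" are hypothetical; the commit's own `embed_query` / `embed_document` may look different.

```rust
use std::env;

/// Expand the two-character sequence `\n` in an env value into a real newline.
fn expand_newlines(raw: &str) -> String {
    raw.replace("\\n", "\n")
}

/// Which side of retrieval a text belongs to.
#[derive(Clone, Copy, Debug, PartialEq)]
enum EmbedSide {
    Query,
    Document,
}

/// Prefixes prepended to text before it is sent to the embedding model.
struct EmbedPrefixes {
    query: String,
    document: String,
}

impl EmbedPrefixes {
    /// Build from raw (unexpanded) values; a missing value means no prefix.
    fn from_raw(query: Option<&str>, document: Option<&str>) -> Self {
        EmbedPrefixes {
            query: expand_newlines(query.unwrap_or("")),
            document: expand_newlines(document.unwrap_or("")),
        }
    }

    /// Read EMBED_QUERY_PREFIX / EMBED_DOCUMENT_PREFIX from the environment.
    fn from_env() -> Self {
        let query = env::var("EMBED_QUERY_PREFIX").ok();
        let document = env::var("EMBED_DOCUMENT_PREFIX").ok();
        Self::from_raw(query.as_deref(), document.as_deref())
    }

    /// Prepend the prefix for the declared side.
    fn apply(&self, side: EmbedSide, text: &str) -> String {
        let prefix = match side {
            EmbedSide::Query => &self.query,
            EmbedSide::Document => &self.document,
        };
        format!("{}{}", prefix, text)
    }
}
```

With the nomic values, `apply(EmbedSide::Query, "beach trip")` yields `search_query: beach trip`. With a Qwen3-style query prefix and an empty document prefix, corpus text passes through unchanged. Either way, changing a prefix changes the vectors, so the re-embed step in the comment block still applies.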