New optional SamplingOverride field forwarded to llama-server as
chat_template_kwargs.enable_thinking (gates Qwen3-style reasoning
blocks). None leaves the template default; other backends ignore it.
Wired through the agentic-insight and chat-turn request bodies/handlers.
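
A minimal sketch of the forwarding rule, assuming the override is an
Option<bool> field named enable_thinking (the field name and the string-based
rendering are illustrative; the real request builder may differ):

```rust
/// Hypothetical stand-in for the per-request sampling overrides.
#[derive(Default)]
struct SamplingOverride {
    /// None leaves the chat template's default untouched.
    enable_thinking: Option<bool>,
}

/// Render the optional `chat_template_kwargs` member of a llama-server
/// request body. Returns None when no override was requested, so the key
/// is omitted from the body entirely.
fn chat_template_kwargs(ov: &SamplingOverride) -> Option<String> {
    ov.enable_thinking
        .map(|on| format!(r#""chat_template_kwargs":{{"enable_thinking":{}}}"#, on))
}
```

Omitting the key for None, rather than sending a default value, is what lets
the template keep its own default.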
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ALL-mode over-constrains NL queries — the model maps several query words to
tags and few photos carry every one, zeroing the candidate set. Switch to
ANY (a photo matches if it has any named tag); the semantic CLIP ranking
provides precision within that pool. Exclude tags still remove matching photos.
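
The difference between the two modes can be sketched as follows. TagMode and
matches_tags are hypothetical names for illustration, not the identifiers used
in the codebase:

```rust
use std::collections::HashSet;

/// Hypothetical mode switch for include-tag matching.
#[derive(Clone, Copy)]
enum TagMode {
    Any,
    All,
}

/// Returns true when a photo passes the tag filter.
/// Exclude tags always win; include tags follow the chosen mode.
fn matches_tags(
    photo_tags: &HashSet<&str>,
    include: &[&str],
    exclude: &[&str],
    mode: TagMode,
) -> bool {
    if exclude.iter().any(|t| photo_tags.contains(t)) {
        return false;
    }
    if include.is_empty() {
        return true; // no include constraint at all
    }
    match mode {
        TagMode::Any => include.iter().any(|t| photo_tags.contains(t)),
        TagMode::All => include.iter().all(|t| photo_tags.contains(t)),
    }
}
```

A photo tagged only "beach" and "sunset" passes a query for "beach" + "dog"
under Any but fails under All, which is the over-constraint described above.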
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When structured filters are present they're the constraint and CLIP only ranks
within the candidate set, so drop the global similarity threshold for that
case. Previously the 0.2 whole-library threshold ran BEFORE intersecting with
the filters, discarding filter-matching photos that scored just under it (e.g.
a 2022 beach photo at 0.18) — producing after_struct_filter=0 even when matches
existed. Plain semantic search (no filters) keeps the user's threshold.
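
A small sketch of the rule, assuming scores arrive as (content_hash,
similarity) pairs and the filter result is an optional set of allowed hashes.
rank_hits is a hypothetical name, not the function in the codebase:

```rust
use std::collections::HashSet;

/// Rank scored photos, best first.
/// With a filter set, membership is the only constraint and the threshold
/// is ignored; without one, the similarity threshold applies.
fn rank_hits<'a>(
    scores: &[(&'a str, f32)],
    allowed: Option<&HashSet<&str>>,
    threshold: f32,
) -> Vec<&'a str> {
    let mut hits: Vec<(&'a str, f32)> = scores
        .iter()
        .copied()
        .filter(|(hash, score)| match allowed {
            Some(set) => set.contains(hash),
            None => *score >= threshold,
        })
        .collect();
    // Sort by similarity, highest first.
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits.into_iter().map(|(hash, _)| hash).collect()
}
```

With scores a=0.31, b=0.18, c=0.25 and a filter set of {b, c}, the 0.18 photo
survives; with no filter set and a 0.2 threshold it is dropped.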
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The CLIP encode failure reason was only ever returned in the HTTP response
body, never logged server-side, making 502s from /photos/search opaque. Log
the underlying cause — network error to the URL, or the Apollo HTTP status +
response body — so CLIP-service problems are diagnosable from the ImageApi log.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pin the NL->structured translation to a small, fast model that can stay
co-resident with CLIP (and the chat model) so it never evicts them on a tight
VRAM budget. Precedence: UNIFIED_SEARCH_MODEL env > client-selected model >
configured default. Logs the effective model (backend.model()) so model A/B
tests are visible. Documented in .env.example.
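
The precedence can be sketched as a pure function plus a thin env wrapper.
The function names are hypothetical, and treating blank values as unset is an
assumption of this sketch rather than documented behavior:

```rust
/// Treat missing or blank values as "not set" (an assumption of this sketch).
fn non_empty(v: Option<&str>) -> Option<&str> {
    v.map(str::trim).filter(|s| !s.is_empty())
}

/// Precedence: env pin > client-selected model > configured default.
fn effective_model(
    env_pin: Option<&str>,
    client_model: Option<&str>,
    default_model: &str,
) -> String {
    non_empty(env_pin)
        .or_else(|| non_empty(client_model))
        .unwrap_or(default_model)
        .to_string()
}

/// Reads the pin from the UNIFIED_SEARCH_MODEL environment variable.
fn effective_model_from_env(client_model: Option<&str>, default_model: &str) -> String {
    let pin = std::env::var("UNIFIED_SEARCH_MODEL").ok();
    effective_model(pin.as_deref(), client_model, default_model)
}
```

Keeping the precedence in a pure function makes it testable without touching
the process environment.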
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Log the translated query (semantic/tags/place/date/media + has_struct), the
tag-filter file count, candidate-row + allowed-hash counts, and the CLIP
considered/hits/after-filter counts. Pinpoints which stage drops results to
zero (over-extracted filter, tag path mismatch, Any/All over-constraint, or
CLIP threshold). Logged at info level for now while debugging.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an optional `model` query param to /photos/search/unified, passed into
resolve_backend's overrides. The client sends the user's currently-selected
local model so the translation step reuses an already-loaded model instead of
forcing a llama-swap eviction + cold start. Falls back to the configured
default when absent. Still local only (no hybrid).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Composes the two existing engines (Path A orchestration):
- Translate NL -> StructuredQuery via local LLM, respecting LLM_BACKEND
  (resolve_backend(Local) -> ollama or llama-swap; no hybrid).
- Forward-geocode the place name into a gps circle.
- Structured filters (tags/EXIF/geo/date/media) build a candidate set of EXIF
  rows; CLIP ranks within it, joined by content_hash. Degenerate cases match
  existing behavior: semantic-only -> plain CLIP; filters-only -> date-sorted.
- Echoes the interpreted query (incl. resolved place) for editable client chips.
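
The dispatch between the combined and degenerate cases listed above can be
sketched like this. SearchPlan and choose_plan are hypothetical names, and the
Nothing variant for an empty query is an assumption not covered by this commit:

```rust
/// Which engine(s) a translated query should run through.
#[derive(Debug, PartialEq)]
enum SearchPlan {
    /// No structured filters: plain CLIP search.
    SemanticOnly,
    /// No semantic text: candidate rows sorted by date.
    FiltersOnly,
    /// Both present: CLIP ranks within the filtered candidate set.
    Combined,
    /// Neither present (assumed case; behavior not specified here).
    Nothing,
}

fn choose_plan(semantic: Option<&str>, has_struct: bool) -> SearchPlan {
    // Blank semantic text counts as absent.
    let has_semantic = semantic.map_or(false, |s| !s.trim().is_empty());
    match (has_semantic, has_struct) {
        (true, true) => SearchPlan::Combined,
        (true, false) => SearchPlan::SemanticOnly,
        (false, true) => SearchPlan::FiltersOnly,
        (false, false) => SearchPlan::Nothing,
    }
}
```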
Refactor: extracted reusable cores from clip_search (score_photos, resolve_hits,
parse_library_scope, score_error_response) shared by both endpoints. Removed the
Phase 1 allow-until-wired attributes now that nl_query + geo are consumed.
fmt + clippy clean; 23 backend tests pass (7 geo, 12 nl_query, 4 unified).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Foundation for the /photos/search/unified endpoint (Phase 2). Two new,
fully unit-tested pieces, not yet wired into a route (allow-until-wired,
mirroring llm_client.rs):
- ai/nl_query.rs: translate a free-text query into a StructuredQuery via one
  grounded LLM call. Two-stage: the model emits names/ISO dates, then a pure
  resolve step maps tag names against the real vocab and converts dates to
  unix seconds. Hallucinated (non-vocab) tags are surfaced in unmatched_tags
  rather than silently used as hard filters (the anti-noise guard). 12 tests.
- geo::forward_geocode + bbox_to_circle: resolve a place name to a circle via
  Nominatim /search, collapsing the bounding box to centroid + circumscribing
  radius so "Portland" and "Italy" both map onto the existing gps circle
  filter with no schema change. Radius is the max centroid-to-corner distance
  (corners aren't equidistant on a sphere). 4 tests.
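
A rough sketch of the bbox-to-circle collapse described above, assuming a
haversine distance on a spherical Earth and an arithmetic-mean centroid. The
Circle type, the argument order, and the lack of antimeridian handling are
assumptions of this sketch, not a copy of geo::bbox_to_circle:

```rust
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two lat/lon points (degrees).
fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlam = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlam / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing sqrt(a) slightly above 1.
    2.0 * EARTH_RADIUS_M * a.sqrt().min(1.0).asin()
}

/// Hypothetical circle type: center in degrees, radius in meters.
struct Circle {
    lat: f64,
    lon: f64,
    radius_m: f64,
}

/// Collapse a bounding box to its centroid plus a radius reaching the
/// farthest corner. Does not handle boxes crossing the antimeridian.
fn bbox_to_circle(south: f64, north: f64, west: f64, east: f64) -> Circle {
    let lat = (south + north) / 2.0;
    let lon = (west + east) / 2.0;
    let radius_m = [(south, west), (south, east), (north, west), (north, east)]
        .iter()
        .map(|&(la, lo)| haversine_m(lat, lon, la, lo))
        .fold(0.0_f64, f64::max);
    Circle { lat, lon, radius_m }
}
```

For a box in the northern hemisphere the southern corners sit farther from
the centroid than the northern ones, which is why the maximum is taken over
all four corners instead of measuring a single one.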
fmt + clippy clean; 19 new tests pass.
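
The tag half of the resolve step in ai/nl_query.rs can be sketched as below.
ResolvedTags and resolve_tags are hypothetical names, and case-insensitive
matching is an assumption; only the unmatched_tags idea comes from this commit:

```rust
use std::collections::HashMap;

/// Outcome of checking model-emitted tag names against the real vocabulary.
#[derive(Debug, Default, PartialEq)]
struct ResolvedTags {
    /// Canonical vocab tags, safe to use as hard filters.
    matched: Vec<String>,
    /// Names the model produced that are not in the vocab.
    unmatched_tags: Vec<String>,
}

/// Pure resolve step: no I/O, no LLM call.
fn resolve_tags(model_tags: &[&str], vocab: &[&str]) -> ResolvedTags {
    // Index the vocab by lowercase name so matching ignores case.
    let by_lower: HashMap<String, &str> =
        vocab.iter().map(|v| (v.to_lowercase(), *v)).collect();
    let mut out = ResolvedTags::default();
    for raw in model_tags {
        let name = raw.trim();
        match by_lower.get(&name.to_lowercase()) {
            Some(canonical) => out.matched.push(canonical.to_string()),
            None => out.unmatched_tags.push(name.to_string()),
        }
    }
    out
}
```

Because unknown names land in unmatched_tags rather than in the filter, a
hallucinated tag cannot zero out the candidate set on its own.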
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>