Commit Graph

23 Commits

Author SHA1 Message Date
Cameron
fa21b0d73d chore(ai): disable default few-shot insight ids
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:12:25 -04:00
Cameron
f0ae9f95dc feat(ai): few-shot exemplars + sticky Ollama preference
- Few-shot injection on /insights/generate/agentic: compresses prior
  training_messages into trajectory blocks (tool calls + result summaries)
  and injects into the system prompt. Hardcoded default ids with optional
  request override.
- New fewshot_source_ids column on photo_insights (+ migration) to track
  which exemplars influenced a given row, for downstream training-set
  filtering. Chat amend rows stamp None with a lineage note.
- Ollama client now remembers which server (primary/fallback) most
  recently succeeded and tries it first on the next call, via a shared
  Arc<AtomicBool>. Avoids re-404ing the primary on every agent iteration
  when the chosen model only lives on the fallback.
- Demote noisy logs: daily_summary "Summary match" lines to debug;
  inner chat_with_tools non-2xx body log from error to warn (outer
  layer owns the terminal-error signal).
- Drift-guard tests for summarize_tool_result covering the success /
  empty / error / unknown shape for every tool.
- Tidy: three pre-existing clippy warnings cleaned up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:54:06 -04:00
Cameron
079cd4c5b9 feat(ai): streaming chat endpoint with live tool events
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.

- Ollama: parses NDJSON from /api/chat stream, accumulates content
  deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
  argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
  through an mpsc channel, persists training_messages at the end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:57:41 -04:00
Cameron
c2bd3c08e1 feat(ai): surface tool invocations in chat history
load_history now groups preceding tool_call + tool_result scaffolding
under each assistant reply as `tools: [{name, arguments, result}]`.
Result bodies over 2000 chars are truncated for payload size with a
`result_truncated` flag; the full value remains in training_messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:03:53 -04:00
Cameron
65ab10e9a8 feat(ai): chat rewind + ollama metrics logging
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.

Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:16:32 -04:00
Cameron
0b9528f61e feat(ai): chat continuation for photo insights (server v1)
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.

Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).

Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:00:27 -04:00
Cameron
e2eefbd156 feat(ai): curated OpenRouter model picker for hybrid backend
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:36:19 -04:00
Cameron
3ac0cd62eb feat(ai): hybrid backend mode for agentic insights
Adds a `backend` column to photo_insights (default 'local', migration
2026-04-20-000000) and a corresponding optional `backend` field on the
agentic request. When a request sets backend=hybrid:

- The local Ollama vision model is called once via describe_image to
  produce a text description.
- The description is inlined into the first user message as text —
  no base64 image is ever sent to the chat model.
- The agentic tool-calling loop and title generation route through an
  OpenRouterClient (dispatched via &dyn LlmClient), letting the user
  pick any tool-capable model from OpenRouter per request.
- describe_photo is removed from the offered tools since the description
  is already present.

Embeddings and vision stay on local Ollama regardless of backend.
Hybrid mode requires OPENROUTER_API_KEY; handlers return a clear error
when hybrid is requested without it, and also when the selected
OpenRouter model lacks tool-calling support.

AppState gains an optional openrouter client built from
OPENROUTER_API_KEY / OPENROUTER_BASE_URL / OPENROUTER_DEFAULT_MODEL /
OPENROUTER_EMBEDDING_MODEL / attribution headers. Default model is
anthropic/claude-sonnet-4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:30:40 -04:00
Cameron
c2ee3996be chore: apply cargo fmt + clippy cleanup across crate
Silence forward-looking dead_code on unused DAO modules, annotate
individual placeholder items, rewrite tautological assert!(true/false)
in token tests as panic! arms, and pick up fmt drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
2d942a9926 feat: content-hash-aware tag/insight sharing + library scoping
Tags and insights now follow content across libraries via content_hash
lookups on the read path, so the same file indexed at different rel_paths
in multiple libraries shares its annotations. Recursive tag search scopes
hits to the selected library by checking each tagged rel_path against
the library's disk (with a content-hash sibling fallback so tags attached
under one library's rel_path still match a content-equivalent file in
another). The /image and /image/metadata handlers fall back across
libraries when the file isn't under the resolved one, so union-mode
search results (which carry no library attribution in the response)
still serve correctly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
b599f7a34b feat: add temperature, top_p, top_k, min_p params to insight generation
Expose Ollama sampling params through the insight generation endpoints
so users can tune creativity/determinism per request. All four are
optional — omitted values fall through to the model's server-side
defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 09:27:59 -04:00
Cameron
c703a47f17 Add the ability to rate insights to curate training data 2026-04-13 09:23:40 -04:00
Cameron
e1c32b6584 Tweak Prompt 2026-04-10 14:30:31 -04:00
Cameron
b2cf99c857 feat: surface Ollama context token usage in agentic insight response
Captures prompt_eval_count and eval_count from Ollama /api/chat responses
during the agentic loop and returns them in POST /insights/generate/agentic
so the frontend can display context window usage to the user.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 17:25:35 -04:00
Cameron
54a49a8562 fix: agentic loop robustness — tool arg sanitisation, geocoding, better errors
- Sanitise tool call arguments before re-sending in conversation history: non-object values (bool, string, null) that some models produce are normalised to {} to prevent Ollama 500s
- Map 'error parsing tool call' Ollama 500 to HTTP 400 with a descriptive message listing compatible models (llama3.1, llama3.2, qwen2.5, mistral-nemo)
- Add reverse_geocode tool backed by existing Nominatim helper; description hints model can chain it after get_location_history results
- Make get_sms_messages contact parameter optional (was required, forcing the model to guess); executor now passes None to fall back to all-contacts search
- Log tool result outcomes at warn level for errors/empty results, info for successes; log SMS API errors with full detail; log full request body on Ollama 500
- Strengthen system prompt to require 3-4 tool calls before final answer
- Try fallback server when checking model capabilities (primary-only check caused 500 for models only on fallback)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:58:01 -04:00
Cameron
091327e5d9 feat: add POST /insights/generate/agentic handler and route
Register the agentic insight endpoint that validates tool-calling capability,
runs the agentic loop, and returns the stored PhotoInsightResponse. Returns 400
for unsupported models, 500 for other errors. Max iterations configurable via
AGENTIC_MAX_ITERATIONS env var (default 10).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:01:25 -04:00
Cameron
f65f4efde8 Make date parse from metadata a little more consistent 2026-01-14 12:54:36 -05:00
Cameron
ad0bba63b4 Add check for vision capabilities 2026-01-11 15:22:24 -05:00
Cameron
b2cc617bc2 Pass image as additional Insight context 2026-01-10 11:30:01 -05:00
Cameron
bb23e6bb25 Cargo fix 2026-01-05 10:31:34 -05:00
Cameron
11e725c443 Enhanced Insights with daily summary embeddings
Bump to 0.5.0. Added daily summary generation job
2026-01-05 09:13:16 -05:00
Cameron
cf52d4ab76 Add Insights Model Discovery and Fallback Handling 2026-01-03 20:27:34 -05:00
Cameron
1171f19845 Create Insight Generation Feature
Added integration with Messages API and Ollama
2026-01-03 10:30:37 -05:00