OpenRouter Support, Insight Chat and User injection #56
Added the ability to wire up OpenRouter for Insights/Chat.
Added the ability to inject a user name into the contexts to improve tool-calling accuracy.
load_history now groups preceding tool_call + tool_result scaffolding under each assistant reply as `tools: [{name, arguments, result}]`. Result bodies over 2000 chars are truncated for payload size with a `result_truncated` flag; the full value remains in training_messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Switch the "Agentic tool call" log from {:?} (Debug) to {} (Display) on serde_json::Value. Display produces compact JSON, `{"date":"2023-08-15"}` instead of `Object {"date": String("2023-08-15")}`, which is what the model actually sent and what a human reading the log wants to see.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- search_rag reranker now logs wall-clock time around the ollama.generate call, the candidate count / top-N going in, and the final reordering. The "final indices" + swap-count line is info level so it's always visible; detailed before/after previews stay at debug for when you want to inspect reranker quality.
- New OllamaClient::generate_no_think convenience that sets Ollama's top-level think:false on the request, plumbed through try_generate via a new internal generate_with_options. Used only by the reranker today; avoids the chain-of-thought tax on reasoning models (Qwen3/VL, DeepSeek-R1 distills, GPT-OSS) when the task has nothing to reason about. Server-side no-op on non-reasoning models.
- OpenRouter chat_with_tools "missing choices[0]" error now includes the actual response body: it extracts structured {error: {code, message}} when OpenRouter surfaces it (common for upstream-provider issues like rate limits and content moderation), otherwise falls back to a truncated raw-JSON view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
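
For reviewers, the load_history grouping and truncation described above can be sketched roughly as follows. This is a minimal std-only illustration, not the code in this PR: the `Message`, `ToolEntry`, and `HistoryItem` types, the `group_history` and `truncate_chars` helpers, and the `MAX_RESULT_CHARS` constant are all hypothetical names, and the sketch assumes each tool_call is immediately followed by its tool_result and that "2000 chars" means Unicode scalar values rather than bytes.

```rust
// Hypothetical message shapes; the real load_history types are not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    User(String),
    ToolCall { name: String, arguments: String },
    ToolResult(String),
    Assistant(String),
}

// One grouped entry, mirroring `tools: [{name, arguments, result}]` plus the flag.
#[derive(Debug, Clone, PartialEq)]
struct ToolEntry {
    name: String,
    arguments: String,
    result: String,
    result_truncated: bool,
}

#[derive(Debug, Clone, PartialEq)]
enum HistoryItem {
    User(String),
    Assistant { content: String, tools: Vec<ToolEntry> },
}

// Assumed limit, taken from the "over 2000 chars" wording in the commit message.
const MAX_RESULT_CHARS: usize = 2000;

/// Cut `s` to at most `max` chars (not bytes) and report whether anything was dropped.
fn truncate_chars(s: &str, max: usize) -> (String, bool) {
    match s.char_indices().nth(max) {
        // A char exists at position `max`, so the string is longer than the limit.
        Some((byte_idx, _)) => (s[..byte_idx].to_string(), true),
        None => (s.to_string(), false),
    }
}

/// Fold tool_call + tool_result scaffolding into the assistant reply that follows it.
fn group_history(messages: &[Message]) -> Vec<HistoryItem> {
    let mut out = Vec::new();
    let mut pending: Vec<ToolEntry> = Vec::new();
    for msg in messages {
        match msg {
            Message::User(text) => {
                // Assumption: scaffolding with no assistant reply is dropped.
                pending.clear();
                out.push(HistoryItem::User(text.clone()));
            }
            Message::ToolCall { name, arguments } => pending.push(ToolEntry {
                name: name.clone(),
                arguments: arguments.clone(),
                result: String::new(),
                result_truncated: false,
            }),
            Message::ToolResult(body) => {
                // Assumption: a result belongs to the most recent call.
                if let Some(entry) = pending.last_mut() {
                    let (result, truncated) = truncate_chars(body, MAX_RESULT_CHARS);
                    entry.result = result;
                    entry.result_truncated = truncated;
                }
            }
            Message::Assistant(content) => out.push(HistoryItem::Assistant {
                content: content.clone(),
                // Move the accumulated tools under this reply and reset the buffer.
                tools: std::mem::take(&mut pending),
            }),
        }
    }
    out
}
```

Truncating on a char boundary avoids slicing through a multi-byte UTF-8 sequence, which would panic with a plain byte-index slice. The untruncated body is assumed to stay available elsewhere, as the commit message says it does in training_messages.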