AI: add enable_thinking reasoning toggle plumbed to llama.cpp

Add an optional enable_thinking SamplingOverride that is forwarded to
llama-server as chat_template_kwargs.enable_thinking, which gates
Qwen3-style reasoning blocks. None leaves the chat template's default
in place; other backends ignore the value. The override is wired through
the agentic-insight and chat-turn request bodies and handlers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-06-17 18:14:44 -04:00
committed by Cameron
parent 7684220f52
commit 475072810e
8 changed files with 55 additions and 0 deletions
@@ -309,6 +309,7 @@ pub async fn generate_script_agentic(
             top_p: None,
             top_k: None,
             min_p: None,
+            enable_thinking: None,
         },
     )
     .await
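
The commit message says that `enable_thinking` is sent as `chat_template_kwargs.enable_thinking` only when it is set, and that `None` leaves the template default alone. A minimal sketch of that mapping follows. It assumes a struct shaped like the literal in the hunk above. The struct definition, both function names, and the hand-rolled JSON assembly are hypothetical illustrations, not the repository's code, which presumably builds the request body with a JSON library.

```rust
/// Hypothetical mirror of the override struct whose literal appears in the diff.
/// Field types are assumptions; only the field names come from the hunk.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SamplingOverride {
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
    pub min_p: Option<f32>,
    pub enable_thinking: Option<bool>,
}

/// Collects the (key, raw JSON value) pairs that would be merged into a
/// llama-server chat request. Unset fields are omitted entirely, so the
/// server and chat template keep their own defaults.
pub fn override_fields(o: &SamplingOverride) -> Vec<(String, String)> {
    let mut fields = Vec::new();
    if let Some(v) = o.top_p {
        fields.push(("top_p".to_string(), v.to_string()));
    }
    if let Some(v) = o.top_k {
        fields.push(("top_k".to_string(), v.to_string()));
    }
    if let Some(v) = o.min_p {
        fields.push(("min_p".to_string(), v.to_string()));
    }
    // enable_thinking is not a top-level sampling parameter: it is nested
    // under chat_template_kwargs so the chat template can read it.
    if let Some(flag) = o.enable_thinking {
        fields.push((
            "chat_template_kwargs".to_string(),
            format!("{{\"enable_thinking\":{}}}", flag),
        ));
    }
    fields
}

/// Joins the pairs into a JSON object string (illustration only; keys here
/// are fixed ASCII identifiers, so no escaping is needed).
pub fn to_json_object(fields: &[(String, String)]) -> String {
    let body = fields
        .iter()
        .map(|(k, v)| format!("\"{}\":{}", k, v))
        .collect::<Vec<_>>()
        .join(",");
    format!("{{{}}}", body)
}
```

With every field left as `None`, the sketch emits no override keys at all, which matches the "None leaves the template default" behavior described in the commit message. A backend that does not understand `chat_template_kwargs` can simply skip that pair.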