AI: add enable_thinking reasoning toggle plumbed to llama.cpp
New optional SamplingOverride forwarded to llama-server as chat_template_kwargs.enable_thinking (gates Qwen3-style reasoning blocks). None leaves the template default; other backends ignore it. Wired through the agentic-insight and chat-turn request bodies/handlers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -309,6 +309,7 @@ pub async fn generate_script_agentic(
|
||||
top_p: None,
|
||||
top_k: None,
|
||||
min_p: None,
|
||||
enable_thinking: None,
|
||||
},
|
||||
)
|
||||
.await
|
||||
|
||||
Reference in New Issue
Block a user