feat(ai): chat rewind + ollama metrics logging

Rewind: POST /insights/chat/rewind truncates training_messages at a given
rendered index, dropping the target message plus any preceding tool-call
scaffolding. The initial user prompt is protected.

Metrics: log prompt_eval_count/duration and eval_count/duration from every
Ollama chat response, rendered as tokens + ms + tok/s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
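
For reviewers, a minimal sketch of how the metrics line could be rendered. It assumes the durations arrive in nanoseconds, as the Ollama API reports them. The struct `OllamaMetrics`, the helpers `render_phase` and `render_metrics`, and the exact output layout are hypothetical and not taken from this commit.

```rust
/// Timing fields from an Ollama chat response.
/// Assumption: both durations are in nanoseconds.
#[derive(Debug, Clone, Copy)]
pub struct OllamaMetrics {
    pub prompt_eval_count: u64,
    pub prompt_eval_duration: u64,
    pub eval_count: u64,
    pub eval_duration: u64,
}

/// Render one phase as "label: N tok in M ms (R tok/s)".
/// A zero duration yields a rate of 0.0 instead of dividing by zero.
fn render_phase(label: &str, tokens: u64, duration_ns: u64) -> String {
    let ms = duration_ns as f64 / 1_000_000.0;
    let tok_per_s = if duration_ns == 0 {
        0.0
    } else {
        tokens as f64 / (duration_ns as f64 / 1_000_000_000.0)
    };
    format!("{}: {} tok in {:.0} ms ({:.1} tok/s)", label, tokens, ms, tok_per_s)
}

/// Render prompt evaluation and generation metrics on a single log line.
pub fn render_metrics(m: &OllamaMetrics) -> String {
    format!(
        "{} | {}",
        render_phase("prompt", m.prompt_eval_count, m.prompt_eval_duration),
        render_phase("eval", m.eval_count, m.eval_duration)
    )
}
```

Keeping the rate calculation in one helper means the prompt and eval phases cannot drift apart in units or rounding.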
@@ -1358,6 +1358,7 @@ fn main() -> std::io::Result<()> {
 .service(ai::get_openrouter_models_handler)
 .service(ai::chat_turn_handler)
 .service(ai::chat_history_handler)
+.service(ai::chat_rewind_handler)
 .service(ai::rate_insight_handler)
 .service(ai::export_training_data_handler)
 .service(libraries::list_libraries)
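
The handler body is not part of this hunk. As a rough sketch of the truncation rule described in the commit message, the following assumes that a "rendered index" counts only user messages and assistant messages without tool calls, and that rendered index 0 is the initial user prompt. The types `Role`, `Message`, `RewindError` and the functions `msg` and `rewind` are hypothetical and do not come from the repository.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
    pub has_tool_calls: bool,
}

#[derive(Debug, PartialEq)]
pub enum RewindError {
    ProtectedPrompt,
    OutOfRange,
}

/// Convenience constructor used in the examples.
pub fn msg(role: Role, content: &str, has_tool_calls: bool) -> Message {
    Message { role, content: content.to_string(), has_tool_calls }
}

impl Message {
    /// Assumption: only user turns and final assistant answers are shown in the UI;
    /// system prompts, tool-call requests, and tool results are hidden scaffolding.
    fn is_rendered(&self) -> bool {
        match self.role {
            Role::User => true,
            Role::Assistant => !self.has_tool_calls,
            Role::System | Role::Tool => false,
        }
    }
}

/// Truncate `messages` so that the rendered message at `rendered_index`, everything
/// after it, and any hidden scaffolding directly before it are removed.
/// Returns the number of raw messages dropped.
pub fn rewind(messages: &mut Vec<Message>, rendered_index: usize) -> Result<usize, RewindError> {
    // Raw positions of the messages that are visible in the chat view.
    let rendered: Vec<usize> = messages
        .iter()
        .enumerate()
        .filter(|(_, m)| m.is_rendered())
        .map(|(i, _)| i)
        .collect();

    if rendered_index >= rendered.len() {
        return Err(RewindError::OutOfRange);
    }
    if rendered_index == 0 {
        // The initial user prompt must survive a rewind.
        return Err(RewindError::ProtectedPrompt);
    }

    // Cut right after the previous rendered message, which also drops any
    // tool-call scaffolding that sits between it and the target.
    let cut = rendered[rendered_index - 1] + 1;
    let removed = messages.len() - cut;
    messages.truncate(cut);
    Ok(removed)
}
```

Cutting after the previous rendered message, instead of at the target itself, is what removes the tool-call request and tool result that led up to the target answer.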