feat(ai): chat rewind + ollama metrics logging

Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.
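
A minimal sketch of the rewind rule described above, not the code from this commit. The `Message`, `Role`, `RewindError`, `msg`, and `rewind` names are hypothetical. It assumes that "rendered" messages are user turns and assistant turns without tool calls, and that "scaffolding" means tool results and assistant turns that carry tool calls.

```rust
// Hypothetical types for illustration; the real training_messages schema may differ.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
    has_tool_calls: bool,
}

#[derive(Debug, PartialEq)]
enum RewindError {
    OutOfRange,
    ProtectedPrompt,
}

// Small constructor to keep call sites short.
fn msg(role: Role, content: &str, has_tool_calls: bool) -> Message {
    Message { role, content: content.to_string(), has_tool_calls }
}

// Assumption: only user turns and plain assistant replies are shown in the chat view.
fn is_rendered(m: &Message) -> bool {
    match m.role {
        Role::User => true,
        Role::Assistant => !m.has_tool_calls,
        _ => false,
    }
}

// Assumption: tool results and assistant tool-call turns count as scaffolding.
fn is_scaffolding(m: &Message) -> bool {
    m.role == Role::Tool || (m.role == Role::Assistant && m.has_tool_calls)
}

/// Truncate `messages` so that the rendered message at `rendered_index`,
/// and any scaffolding directly before it, is dropped.
/// Returns the new length on success.
fn rewind(messages: &mut Vec<Message>, rendered_index: usize) -> Result<usize, RewindError> {
    // Map the rendered index back to a position in the raw list.
    let raw = messages
        .iter()
        .enumerate()
        .filter(|(_, m)| is_rendered(m))
        .nth(rendered_index)
        .map(|(i, _)| i)
        .ok_or(RewindError::OutOfRange)?;

    // The first rendered message is the initial user prompt and is never removed.
    if rendered_index == 0 {
        return Err(RewindError::ProtectedPrompt);
    }

    // Walk backwards over the tool-call scaffolding that produced the target.
    let mut cut = raw;
    while cut > 0 && is_scaffolding(&messages[cut - 1]) {
        cut -= 1;
    }

    messages.truncate(cut);
    Ok(cut)
}
```

Under these assumptions, rewinding to an assistant reply that was produced after a tool call also removes the tool-call turn and the tool result, so the history ends at the preceding user message.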

Metrics: log prompt_eval_count/prompt_eval_duration and
eval_count/eval_duration from every Ollama chat response, rendered as
token counts, milliseconds, and tok/s.
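
A hedged sketch of how such a log line could be derived, not the code from this commit. Ollama reports the duration fields in nanoseconds. The `PhaseMetrics` type, the `format_phase` helper, and the exact output layout are hypothetical.

```rust
/// Token count and duration for one phase (prompt evaluation or generation).
/// Durations are in nanoseconds, as Ollama reports them.
#[derive(Debug, Clone, Copy)]
struct PhaseMetrics {
    tokens: u64,
    duration_ns: u64,
}

impl PhaseMetrics {
    /// Duration converted from nanoseconds to milliseconds.
    fn millis(&self) -> f64 {
        self.duration_ns as f64 / 1_000_000.0
    }

    /// Throughput in tokens per second, or None when the duration is zero.
    fn tokens_per_sec(&self) -> Option<f64> {
        if self.duration_ns == 0 {
            None
        } else {
            Some(self.tokens as f64 * 1e9 / self.duration_ns as f64)
        }
    }
}

/// Render one phase as "label: N tokens in M ms (R tok/s)".
/// The rate is omitted when it cannot be computed.
fn format_phase(label: &str, m: PhaseMetrics) -> String {
    match m.tokens_per_sec() {
        Some(rate) => format!(
            "{}: {} tokens in {:.0} ms ({:.1} tok/s)",
            label,
            m.tokens,
            m.millis(),
            rate
        ),
        None => format!("{}: {} tokens in {:.0} ms", label, m.tokens, m.millis()),
    }
}
```

Guarding against a zero duration avoids logging an infinite or NaN rate when a response omits the timing fields or reports them as zero.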

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron
2026-04-21 15:16:32 -04:00
parent 0b9528f61e
commit 65ab10e9a8
5 changed files with 270 additions and 4 deletions


@@ -11,10 +11,10 @@ pub mod sms_client;
#[allow(unused_imports)]
pub use daily_summary_job::{generate_daily_summaries, strip_summary_boilerplate};
pub use handlers::{
-    chat_history_handler, chat_turn_handler, delete_insight_handler, export_training_data_handler,
-    generate_agentic_insight_handler, generate_insight_handler, get_all_insights_handler,
-    get_available_models_handler, get_insight_handler, get_openrouter_models_handler,
-    rate_insight_handler,
+    chat_history_handler, chat_rewind_handler, chat_turn_handler, delete_insight_handler,
+    export_training_data_handler, generate_agentic_insight_handler, generate_insight_handler,
+    get_all_insights_handler, get_available_models_handler, get_insight_handler,
+    get_openrouter_models_handler, rate_insight_handler,
};
pub use insight_generator::InsightGenerator;
#[allow(unused_imports)]