Add reconnectable async chat-turn flow with in-memory TurnRegistry

Replace the one-shot SSE chat stream with an async dispatch + reconnectable
replay flow so the mobile client survives backgrounding, network blips, and
OS-killed sockets without losing an in-flight agentic turn.

- TurnRegistry/TurnEntry: in-memory per-turn event buffer (capped at 500
  events, oldest evicted first) shared by the agentic loop (writer) and SSE
  replay readers.
  ReplayOutcome + replay_from/next_batch distinguish Events/CaughtUp/Gone;
  next_batch registers the Notify before reading state (no lost wakeup) and
  drains every buffered event before signaling terminal, so the final
  Done/Error is never dropped and the stream closes cleanly.
- Endpoints: POST /insights/chat/turn (202 + turn_id), GET
  /insights/chat/turn/{id} (SSE replay, ?skip_before= resume, per-event seq,
  410 on eviction), DELETE /insights/chat/turn/{id} (real task abort +
  cooperative is_running() check at each loop boundary).
- Cancellation actually stops the task (AbortHandle stored on the entry) and
  emits a Done{cancelled:true}; callers skip persistence on cancel.
- Background sweeper drops stale turns; interval clamped to <=300s.
- OpenTelemetry spans: ai.chat.turn.execute/replay/cancel.
- Legacy POST /insights/chat/stream path preserved unchanged.
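
A minimal, std-only sketch of the capped buffer and the Events/CaughtUp/Gone
split described in the first bullet. It is not the code in this commit:
`TurnBuffer`, `Replay`, the `(seq, payload)` tuple shape, and the reading of
Gone as "the requested seq was already evicted" are assumptions made for
illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical outcome type mirroring the Events/CaughtUp/Gone split.
#[derive(Debug, PartialEq)]
enum Replay {
    /// Buffered events at or after the requested seq.
    Events(Vec<(u64, String)>),
    /// Nothing new yet; the reader is up to date.
    CaughtUp,
    /// The requested seq was evicted (assumed to map to HTTP 410).
    Gone,
}

/// Hypothetical capped per-turn buffer; `cap` is assumed to be >= 1.
struct TurnBuffer {
    cap: usize,
    next_seq: u64,
    events: VecDeque<(u64, String)>,
}

impl TurnBuffer {
    fn new(cap: usize) -> Self {
        Self { cap, next_seq: 0, events: VecDeque::new() }
    }

    /// Append an event, evicting the oldest one when the buffer is full.
    /// Returns the seq stamped on the new event.
    fn push(&mut self, payload: &str) -> u64 {
        if self.events.len() == self.cap {
            self.events.pop_front();
        }
        let seq = self.next_seq;
        self.next_seq += 1;
        self.events.push_back((seq, payload.to_string()));
        seq
    }

    /// Replay every buffered event with seq >= `from`.
    fn replay_from(&self, from: u64) -> Replay {
        // With an empty buffer nothing has been evicted, so use next_seq.
        let oldest = self.events.front().map(|(s, _)| *s).unwrap_or(self.next_seq);
        if from < oldest {
            return Replay::Gone;
        }
        let out: Vec<_> = self
            .events
            .iter()
            .filter(|(s, _)| *s >= from)
            .cloned()
            .collect();
        if out.is_empty() { Replay::CaughtUp } else { Replay::Events(out) }
    }
}
```

With a cap of 2 and three pushes, seq 0 has been evicted, so a reader resuming
from 0 gets Gone, a reader resuming from 1 gets both remaining events, and a
reader resuming from 3 is CaughtUp.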

Tests: registry coverage for terminal delivery (race guard), waiting, Gone,
abort, eviction; handler integration tests for 404/410, skip_before, seq
stamping, completed replay, and cancel.
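
The terminal-delivery race guard can be illustrated with a std-only sketch.
It swaps in Mutex + Condvar for the Notify used in this commit, so it shows
the same ordering rule (check buffered events first, report terminal only
when nothing is left, and never sleep without re-checking state under the
lock) rather than the actual implementation. `Turn`, `Batch`, `emit`, and
`run_demo` are hypothetical names.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical result of one read step.
#[derive(Debug, PartialEq)]
enum Batch {
    Events(Vec<String>),
    Terminal,
}

#[derive(Default)]
struct State {
    events: Vec<String>,
    finished: bool,
}

/// Hypothetical turn entry: shared state plus a wakeup primitive.
#[derive(Default)]
struct Turn {
    state: Mutex<State>,
    wake: Condvar,
}

impl Turn {
    /// Writer side: append an event and optionally mark the turn finished.
    fn emit(&self, event: &str, last: bool) {
        let mut st = self.state.lock().unwrap();
        st.events.push(event.to_string());
        st.finished |= last;
        self.wake.notify_all();
    }

    /// Reader side: return everything past `cursor`, else Terminal, else wait.
    fn next_batch(&self, cursor: usize) -> Batch {
        let mut st = self.state.lock().unwrap();
        loop {
            // Drain buffered events first so the final event is never skipped.
            if cursor < st.events.len() {
                return Batch::Events(st.events[cursor..].to_vec());
            }
            // Only report terminal once the reader has seen every event.
            if st.finished {
                return Batch::Terminal;
            }
            // The lock is held across the check and the wait, so a wakeup
            // sent between them cannot be lost.
            st = self.wake.wait(st).unwrap();
        }
    }
}

/// Run one writer thread against one reader and return what the reader saw.
fn run_demo() -> Vec<String> {
    let turn = Arc::new(Turn::default());
    let writer = {
        let turn = Arc::clone(&turn);
        thread::spawn(move || {
            turn.emit("delta", false);
            turn.emit("done", true);
        })
    };
    let mut seen = Vec::new();
    loop {
        match turn.next_batch(seen.len()) {
            Batch::Events(batch) => seen.extend(batch),
            Batch::Terminal => break,
        }
    }
    writer.join().unwrap();
    seen
}
```

Whatever the thread interleaving, the reader in `run_demo` collects both
events before it observes Terminal, which is the property the registry tests
above are described as guarding.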

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-05-29 19:50:25 -04:00
parent 0c1c1c6792
commit 962f7bf05c
8 changed files with 1946 additions and 17 deletions
+4 -3
@@ -11,6 +11,7 @@ pub mod llm_client;
 pub mod ollama;
 pub mod openrouter;
 pub mod sms_client;
+pub mod turn_registry;
 // strip_summary_boilerplate is used by binaries (test_daily_summary), not the library
 #[allow(unused_imports)]
@@ -19,11 +20,11 @@ pub use daily_summary_job::{
     generate_daily_summaries, strip_summary_boilerplate,
 };
 pub use handlers::{
-    cancel_generation_handler, chat_history_handler, chat_rewind_handler, chat_stream_handler,
-    chat_turn_handler, delete_insight_handler, export_training_data_handler,
+    cancel_generation_handler, cancel_turn_handler, chat_history_handler, chat_rewind_handler,
+    chat_stream_handler, chat_turn_handler, delete_insight_handler, export_training_data_handler,
     generate_agentic_insight_handler, generate_insight_handler, generation_status_handler,
     get_all_insights_handler, get_available_models_handler, get_insight_handler,
-    get_openrouter_models_handler, rate_insight_handler,
+    get_openrouter_models_handler, rate_insight_handler, turn_async_handler, turn_replay_handler,
 };
 pub use insight_generator::InsightGenerator;
 pub use llamacpp::LlamaCppClient;