Add Insights Model Discovery and Fallback Handling

This commit is contained in:
Cameron
2026-01-03 20:27:34 -05:00
parent 1171f19845
commit cf52d4ab76
10 changed files with 419 additions and 80 deletions

View File

@@ -250,8 +250,31 @@ Optional:
WATCH_QUICK_INTERVAL_SECONDS=60 # Quick scan interval
WATCH_FULL_INTERVAL_SECONDS=3600 # Full scan interval
OTLP_OTLS_ENDPOINT=http://... # OpenTelemetry collector (release builds)
# AI Insights Configuration
OLLAMA_PRIMARY_URL=http://desktop:11434 # Primary Ollama server (e.g., desktop)
OLLAMA_FALLBACK_URL=http://server:11434 # Fallback Ollama server (optional, always-on)
OLLAMA_PRIMARY_MODEL=nemotron-3-nano:30b # Model for primary server (default: nemotron-3-nano:30b)
OLLAMA_FALLBACK_MODEL=llama3.2:3b # Model for fallback server (optional, uses primary if not set)
SMS_API_URL=http://localhost:8000 # SMS message API endpoint (default: localhost:8000)
SMS_API_TOKEN=your-api-token # SMS API authentication token (optional)
```
**AI Insights Fallback Behavior:**
- Primary server is tried first with its configured model (5-second connection timeout)
- On connection failure, automatically falls back to secondary server with its model (if configured)
- If `OLLAMA_FALLBACK_MODEL` not set, uses same model as primary server on fallback
- Total request timeout is 120 seconds to accommodate slow LLM inference
- Logs indicate which server and model was used (info level) and failover attempts (warn level)
- Backwards compatible: `OLLAMA_URL` and `OLLAMA_MODEL` still supported as fallbacks
**Model Discovery:**
The `OllamaClient` provides methods to query available models:
- `OllamaClient::list_models(url)` - Returns list of all models on a server
- `OllamaClient::is_model_available(url, model_name)` - Checks if a specific model exists
This allows runtime verification of model availability before generating insights.
## Dependencies of Note
- **actix-web**: HTTP framework