Add Insights Model Discovery and Fallback Handling
23
CLAUDE.md
@@ -250,8 +250,31 @@ Optional:
WATCH_QUICK_INTERVAL_SECONDS=60 # Quick scan interval
WATCH_FULL_INTERVAL_SECONDS=3600 # Full scan interval
OTLP_OTLS_ENDPOINT=http://... # OpenTelemetry collector (release builds)

# AI Insights Configuration
OLLAMA_PRIMARY_URL=http://desktop:11434 # Primary Ollama server (e.g., desktop)
OLLAMA_FALLBACK_URL=http://server:11434 # Fallback Ollama server (optional, always-on)
OLLAMA_PRIMARY_MODEL=nemotron-3-nano:30b # Model for primary server (default: nemotron-3-nano:30b)
OLLAMA_FALLBACK_MODEL=llama3.2:3b # Model for fallback server (optional, uses primary if not set)
SMS_API_URL=http://localhost:8000 # SMS message API endpoint (default: localhost:8000)
SMS_API_TOKEN=your-api-token # SMS API authentication token (optional)
```
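
The variables above could be resolved roughly as follows. This is a minimal sketch, not the project's actual loader: `InsightsConfig` and `resolve_config` are hypothetical names, the `http://` scheme on the SMS default is assumed from the example value, and treating a missing primary URL as an error is an assumption (the section does not state a default). The legacy `OLLAMA_URL` and `OLLAMA_MODEL` names are consulted only when the primary ones are unset. In production the lookup could be `|k| std::env::var(k).ok()`; taking a closure keeps the function easy to exercise without touching the process environment.

```rust
/// Resolved AI Insights settings (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
pub struct InsightsConfig {
    pub primary_url: String,
    pub primary_model: String,
    pub fallback_url: Option<String>,
    pub fallback_model: String,
    pub sms_api_url: String,
    pub sms_api_token: Option<String>,
}

const DEFAULT_PRIMARY_MODEL: &str = "nemotron-3-nano:30b";
// Assumption: the documented default "localhost:8000" is served over plain HTTP.
const DEFAULT_SMS_API_URL: &str = "http://localhost:8000";

/// Builds the config from any key lookup (environment, map, ...).
pub fn resolve_config<F>(lookup: F) -> Result<InsightsConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Treat empty or whitespace-only values as unset.
    let get = |key: &str| lookup(key).filter(|v| !v.trim().is_empty());

    // New names win; legacy names are only a fallback.
    let primary_url = get("OLLAMA_PRIMARY_URL")
        .or_else(|| get("OLLAMA_URL"))
        // Assumption: no default URL is documented, so report an error.
        .ok_or_else(|| "set OLLAMA_PRIMARY_URL (or legacy OLLAMA_URL)".to_string())?;
    let primary_model = get("OLLAMA_PRIMARY_MODEL")
        .or_else(|| get("OLLAMA_MODEL"))
        .unwrap_or_else(|| DEFAULT_PRIMARY_MODEL.to_string());

    // The fallback server is optional; its model defaults to the primary model.
    let fallback_url = get("OLLAMA_FALLBACK_URL");
    let fallback_model = get("OLLAMA_FALLBACK_MODEL").unwrap_or_else(|| primary_model.clone());

    Ok(InsightsConfig {
        primary_url,
        primary_model,
        fallback_url,
        fallback_model,
        sms_api_url: get("SMS_API_URL").unwrap_or_else(|| DEFAULT_SMS_API_URL.to_string()),
        sms_api_token: get("SMS_API_TOKEN"),
    })
}
```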

**AI Insights Fallback Behavior:**
- The primary server is tried first with its configured model (5-second connection timeout)
- On connection failure, the request automatically falls back to the fallback server with its model (if configured)
- If `OLLAMA_FALLBACK_MODEL` is not set, the fallback server uses the same model as the primary server
- Total request timeout is 120 seconds to accommodate slow LLM inference
- Logs indicate which server and model were used (info level) and any failover attempts (warn level)
- Backwards compatible: `OLLAMA_URL` and `OLLAMA_MODEL` are still supported as fallbacks when the primary variables are not set
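
The failover order described above can be sketched independently of any HTTP client. Everything here is illustrative: `Target`, `SendError`, and `generate_with_fallback` are hypothetical names, the `send` closure stands in for the real request, and the `eprintln!` lines stand in for info/warn logging. The sketch assumes that only connection failures trigger the fallback and that other errors are returned unchanged; the source states the first half but not the second. The two timeout constants would be handed to whatever client performs the request (for instance reqwest's `ClientBuilder::connect_timeout` and `ClientBuilder::timeout`, if that is the client in use).

```rust
use std::time::Duration;

/// Timeouts from the behavior list; the caller's HTTP client applies them.
pub const CONNECT_TIMEOUT: Duration = Duration::from_secs(5);
pub const REQUEST_TIMEOUT: Duration = Duration::from_secs(120);

/// One Ollama server plus the model to request from it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Target {
    pub url: String,
    pub model: String,
}

#[derive(Debug, PartialEq)]
pub enum SendError {
    /// The server could not be reached; eligible for failover.
    Connect(String),
    /// Any other failure; returned to the caller as is (assumption).
    Other(String),
}

/// Tries `primary`, then `fallback` on a connection failure.
/// Returns the target that answered together with its response text.
pub fn generate_with_fallback<F>(
    primary: &Target,
    fallback: Option<&Target>,
    prompt: &str,
    mut send: F,
) -> Result<(Target, String), SendError>
where
    F: FnMut(&Target, &str) -> Result<String, SendError>,
{
    match send(primary, prompt) {
        Ok(text) => {
            eprintln!("[info] used {} with model {}", primary.url, primary.model);
            Ok((primary.clone(), text))
        }
        Err(SendError::Connect(reason)) => match fallback {
            Some(fb) => {
                eprintln!("[warn] {} unreachable ({}), trying {}", primary.url, reason, fb.url);
                let text = send(fb, prompt)?;
                eprintln!("[info] used {} with model {}", fb.url, fb.model);
                Ok((fb.clone(), text))
            }
            // No fallback configured: surface the original connection error.
            None => Err(SendError::Connect(reason)),
        },
        Err(other) => Err(other),
    }
}
```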

**Model Discovery:**
The `OllamaClient` provides methods to query available models:
- `OllamaClient::list_models(url)` - Returns a list of all models on a server
- `OllamaClient::is_model_available(url, model_name)` - Checks whether a specific model exists on a server

This allows runtime verification of model availability before generating insights.
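
As a rough illustration of that verification step, the sketch below works on a list of model names such as `list_models` might return. The real return types and signatures of the `OllamaClient` methods are not shown in this diff, so `model_in_list` and `first_available` are hypothetical helpers, and the exact-match comparison (tag included) is an assumption.

```rust
/// Returns true if `wanted` matches one of the reported model names exactly.
pub fn model_in_list(available: &[String], wanted: &str) -> bool {
    available.iter().any(|name| name.as_str() == wanted)
}

/// Picks the first candidate, in preference order, that the server reported.
pub fn first_available<'a>(available: &[String], candidates: &[&'a str]) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        .find(|candidate| model_in_list(available, candidate))
}
```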

## Dependencies of Note

- **actix-web**: HTTP framework