# Image API
This is an Actix-web server for serving images and videos from a filesystem.
Upon first run it will generate thumbnails for all images and videos at `BASE_PATH`.

## Features
- Automatic thumbnail generation for images and videos
- EXIF data extraction and storage for photos
- File watching with NFS support (polling-based)
- Video streaming with HLS
- Tag-based organization
- Memories API for browsing photos by date
- **Video Wall** - Auto-generated short preview clips for videos, served via a grid view
- **AI-Powered Photo Insights** - Generate contextual insights from photos using LLMs
- **RAG-based Context Retrieval** - Semantic search over daily conversation summaries
- **Automatic Daily Summaries** - LLM-generated summaries of daily conversations with embeddings

## Environment
The API requires a handful of environment variables to run.
Define them in a `.env` file located in the binary's directory or in a directory above it.
You must have `ffmpeg` installed for streaming video and generating video thumbnails.

- `DATABASE_URL` is a path or URL to a database (currently only SQLite is tested)
- `BASE_PATH` is the root from which you want to serve images and videos
- `THUMBNAILS` is a path where generated thumbnails should be stored
- `VIDEO_PATH` is a path where HLS playlists and video parts should be stored
- `GIFS_DIRECTORY` is a path where generated video GIF thumbnails should be stored
- `BIND_URL` is the URL and port to bind to (typically your own IP address)
- `SECRET_KEY` is the *hopefully* random string used to sign tokens
- `RUST_LOG` is one of `off, error, warn, info, debug, trace`, from least to most noisy [default: `error`]
- `EXCLUDED_DIRS` is a comma-separated list of directories to exclude from the Memories API
- `PREVIEW_CLIPS_DIRECTORY` (optional) is a path where generated video preview clips should be stored [default: `preview_clips`]
- `WATCH_QUICK_INTERVAL_SECONDS` (optional) is the interval in seconds for quick file scans [default: 60]
- `WATCH_FULL_INTERVAL_SECONDS` (optional) is the interval in seconds for full file scans [default: 3600]

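
The variable list above could be checked at startup along the following lines. This is a minimal sketch, not the server's actual code: the `REQUIRED_VARS` selection (every variable without a stated default or an "optional" marker, minus `EXCLUDED_DIRS`), and the helper names `missing_vars` and `interval_seconds` are assumptions made for illustration.

```rust
/// Variables treated as required in this sketch (an assumption: the README
/// does not say exactly which ones the server refuses to start without).
const REQUIRED_VARS: [&str; 7] = [
    "DATABASE_URL",
    "BASE_PATH",
    "THUMBNAILS",
    "VIDEO_PATH",
    "GIFS_DIRECTORY",
    "BIND_URL",
    "SECRET_KEY",
];

/// Returns the names of required variables that are unset or blank.
/// `lookup` is injected so the check can run without touching the
/// real process environment.
fn missing_vars<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    REQUIRED_VARS
        .iter()
        .copied()
        .filter(|name| lookup(*name).map_or(true, |v| v.trim().is_empty()))
        .collect()
}

/// Parses an optional interval in seconds, falling back to `default`
/// when the variable is unset or not a valid non-negative integer.
fn interval_seconds(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default)
}
```

Against the real environment, the check could be called as `missing_vars(|k| std::env::var(k).ok())`, and the quick-scan interval read as `interval_seconds(std::env::var("WATCH_QUICK_INTERVAL_SECONDS").ok().as_deref(), 60)`.
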
### AI Insights Configuration (Optional)

The following environment variables configure AI-powered photo insights and daily conversation summaries:

#### Ollama Configuration
- `OLLAMA_PRIMARY_URL` - Primary Ollama server URL [default: `http://localhost:11434`]
  - Example: `http://desktop:11434` (your main/powerful server)
- `OLLAMA_FALLBACK_URL` - Fallback Ollama server URL (optional)
  - Example: `http://server:11434` (always-on backup server)
- `OLLAMA_PRIMARY_MODEL` - Model to use on primary server [default: `nemotron-3-nano:30b`]
  - Example: `nemotron-3-nano:30b`, `llama3.2:3b`, etc.
- `OLLAMA_FALLBACK_MODEL` - Model to use on fallback server (optional)
  - If not set, uses `OLLAMA_PRIMARY_MODEL` on fallback server

**Legacy Variables** (still supported):
- `OLLAMA_URL` - Used if `OLLAMA_PRIMARY_URL` not set
- `OLLAMA_MODEL` - Used if `OLLAMA_PRIMARY_MODEL` not set

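
The precedence described above (new variable, then legacy variable, then default) can be sketched as follows. The function names `resolve_with_legacy` and `fallback_model` are hypothetical and only illustrate the lookup order; the server's real resolution code may differ.

```rust
/// Resolves a setting by trying the current variable name first, then the
/// legacy name, then a built-in default.
fn resolve_with_legacy<F>(lookup: F, current: &str, legacy: &str, default: &str) -> String
where
    F: Fn(&str) -> Option<String>,
{
    lookup(current)
        .or_else(|| lookup(legacy))
        .unwrap_or_else(|| default.to_string())
}

/// Picks the model for the fallback server: `OLLAMA_FALLBACK_MODEL` when set,
/// otherwise the already-resolved primary model.
fn fallback_model<F>(lookup: F, primary_model: &str) -> String
where
    F: Fn(&str) -> Option<String>,
{
    lookup("OLLAMA_FALLBACK_MODEL").unwrap_or_else(|| primary_model.to_string())
}
```

For example, `resolve_with_legacy(|k| std::env::var(k).ok(), "OLLAMA_PRIMARY_URL", "OLLAMA_URL", "http://localhost:11434")` would yield the primary server URL under these rules.
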
#### OpenRouter Configuration (Hybrid Backend)
The hybrid agentic backend keeps embeddings + vision local (Ollama) while routing
chat + tool-calling to OpenRouter. Enabled per-request when the client sends
`backend=hybrid`.

- `OPENROUTER_API_KEY` - OpenRouter API key. Required to enable the hybrid backend.
- `OPENROUTER_DEFAULT_MODEL` - Model id used when the client doesn't specify one
  [default: `anthropic/claude-sonnet-4`]
  - Example: `openai/gpt-4o-mini`, `google/gemini-2.5-flash`
- `OPENROUTER_ALLOWED_MODELS` - Comma-separated curated allowlist exposed to
  clients via `GET /insights/openrouter/models`. The mobile picker shows only
  these. Empty/unset = no picker, server default is used.
  - Example: `openai/gpt-4o-mini,anthropic/claude-haiku-4-5,google/gemini-2.5-flash`
- `OPENROUTER_BASE_URL` - Override base URL [default: `https://openrouter.ai/api/v1`]
- `OPENROUTER_EMBEDDING_MODEL` - Embedding model for OpenRouter
  [default: `openai/text-embedding-3-small`]. Only used if/when embeddings are
  routed through OpenRouter (currently embeddings stay local).
- `OPENROUTER_HTTP_REFERER` - Optional `HTTP-Referer` for OpenRouter attribution
- `OPENROUTER_APP_TITLE` - Optional `X-Title` for OpenRouter attribution

Capability checks are skipped for the curated allowlist, so bad model ids surface
as a 4xx from the chat call. Pick tool-capable models.

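
As an illustration of the allowlist format, the sketch below splits `OPENROUTER_ALLOWED_MODELS` into model ids and treats an empty or unset value as "no picker". The function name `parse_allowed_models` and the whitespace trimming are assumptions, not confirmed server behavior.

```rust
/// Splits a comma-separated allowlist into model ids.
/// Surrounding whitespace and empty entries are dropped (an assumption);
/// an unset or blank value yields an empty list, meaning no picker.
fn parse_allowed_models(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|id| !id.is_empty())
        .map(String::from)
        .collect()
}
```

Note that nothing here validates the ids: as stated above, a bad id only shows up later as a 4xx from the chat call.
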
#### SMS API Configuration
- `SMS_API_URL` - URL to SMS message API [default: `http://localhost:8000`]
  - Used to fetch conversation data for context in insights
- `SMS_API_TOKEN` - Authentication token for SMS API (optional)

#### Agentic Insight Generation
- `AGENTIC_MAX_ITERATIONS` - Maximum tool-call iterations per agentic insight request [default: `10`]
  - Controls how many times the model can invoke tools before being forced to produce a final answer
  - Increase for more thorough context gathering; decrease to limit response time

#### Insight Chat Continuation
After an agentic insight is generated, the conversation can be continued. Endpoints:
- `POST /insights/chat` - single-turn reply (non-streaming)
- `POST /insights/chat/stream` - SSE variant with live `text` deltas and
  `tool_call` / `tool_result` events. The mobile client uses this.
- `GET /insights/chat/history?path=...&library=...` - rendered transcript;
  each assistant message carries a `tools: [{name, arguments, result}]` array
- `POST /insights/chat/rewind` - truncate the transcript at a rendered index
  (drops that message, any preceding tool scaffolding, and all later turns). Used
  for "try again from here" flows. The initial user message is protected.

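
The rewind rule is easiest to see on a small in-memory model. The sketch below is an interpretation of the description above, not the server's implementation: the `Message` enum, the idea that only user and assistant messages count toward the rendered index, and the treatment of index 0 as the protected initial user message are all assumptions.

```rust
/// Simplified transcript entry (hypothetical; the stored format may differ).
#[derive(Debug, Clone, PartialEq)]
enum Message {
    User(String),
    Assistant(String),
    ToolCall(String),   // tool scaffolding, not shown in the rendered transcript
    ToolResult(String), // tool scaffolding, not shown in the rendered transcript
}

impl Message {
    /// Only user and assistant messages count toward the rendered index.
    fn is_rendered(&self) -> bool {
        matches!(self, Message::User(_) | Message::Assistant(_))
    }
}

/// Truncates `transcript` at `rendered_index`: drops that message, any tool
/// scaffolding immediately before it, and everything after it.
fn rewind(transcript: &mut Vec<Message>, rendered_index: usize) -> Result<(), String> {
    if rendered_index == 0 {
        return Err("the initial user message is protected".to_string());
    }
    // Map the rendered index to a position in the raw transcript.
    let raw_pos = transcript
        .iter()
        .enumerate()
        .filter(|(_, m)| m.is_rendered())
        .nth(rendered_index)
        .map(|(i, _)| i)
        .ok_or_else(|| format!("rendered index {} is out of range", rendered_index))?;
    // Walk back over the scaffolding that led up to this message.
    let mut cut = raw_pos;
    while cut > 0 && !transcript[cut - 1].is_rendered() {
        cut -= 1;
    }
    transcript.truncate(cut);
    Ok(())
}
```

Under this model, rewinding at an assistant reply leaves the preceding user message as the last entry, which is what a "try again from here" flow needs before re-sending the turn.
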
Amend mode (`amend: true` in the chat request body) regenerates the insight's
title and inserts a new row instead of appending to the existing transcript,
so you can rewrite the saved summary from within chat.

- `AGENTIC_CHAT_MAX_ITERATIONS` - Cap on tool-calling iterations per chat turn [default: `6`]
  - Per-request `max_iterations` (when sent by the client) is clamped to this cap

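
The clamping rule can be written in a few lines. This is a sketch under two assumptions that the text above does not confirm: that an unparsable `AGENTIC_CHAT_MAX_ITERATIONS` falls back to the default of 6, and that a request without `max_iterations` simply uses the cap. The function names are hypothetical.

```rust
/// Default cap documented for `AGENTIC_CHAT_MAX_ITERATIONS`.
const DEFAULT_CHAT_CAP: u32 = 6;

/// Reads the server-side cap, using the default when the variable is unset
/// or not a valid number (assumed behavior).
fn chat_iteration_cap(raw: Option<&str>) -> u32 {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .unwrap_or(DEFAULT_CHAT_CAP)
}

/// Clamps the client's requested `max_iterations` to the cap.
/// A missing request value is assumed to mean "use the cap".
fn effective_iterations(requested: Option<u32>, cap: u32) -> u32 {
    requested.map_or(cap, |n| n.min(cap))
}
```

With the default cap, a client asking for 20 iterations gets 6, while a client asking for 3 gets 3.
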
#### Fallback Behavior
- The primary server is tried first, with a 5-second connection timeout
- On failure, the request automatically falls back to the secondary server (if configured)
- The total request timeout is 120 seconds to accommodate LLM inference
- Logs indicate which server/model was used and any failover attempts

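
The failover order above can be sketched without any HTTP client by passing the actual request in as a closure. The `Server` struct, `generate_with_fallback`, and the use of `eprintln!` in place of the server's logger are illustrative assumptions; the two timeout constants mirror the values listed above and would be handed to whatever HTTP client performs the call.

```rust
use std::time::Duration;

/// Timeouts from the list above, to be applied by the HTTP client in use.
const CONNECT_TIMEOUT: Duration = Duration::from_secs(5);
const REQUEST_TIMEOUT: Duration = Duration::from_secs(120);

/// One Ollama endpoint plus the model to use on it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Server {
    url: String,
    model: String,
}

/// Tries `primary` first; on failure, retries once on `fallback` if one is
/// configured. Returns the URL of the server that answered and its reply.
fn generate_with_fallback<F>(
    primary: &Server,
    fallback: Option<&Server>,
    mut call: F,
) -> Result<(String, String), String>
where
    F: FnMut(&Server) -> Result<String, String>,
{
    match call(primary) {
        Ok(text) => Ok((primary.url.clone(), text)),
        Err(primary_err) => match fallback {
            Some(server) => {
                // Record the failover attempt, as the real server does in its logs.
                eprintln!(
                    "primary {} ({}) failed: {}; trying fallback {} ({})",
                    primary.url, primary.model, primary_err, server.url, server.model
                );
                call(server).map(|text| (server.url.clone(), text))
            }
            None => Err(primary_err),
        },
    }
}
```

Keeping the request behind a closure means the ordering logic can be exercised with a stub, independent of network access.
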
#### Daily Summary Generation
Daily conversation summaries are generated automatically on server startup. Configure in `src/main.rs`:
- Date range for summary generation
- Contacts to process
- Model version used for embeddings: `nomic-embed-text:v1.5`