Cameron b9d5578653 feat(bins): multi-library populate_knowledge + progress UX
populate_knowledge now loads real libraries from the DB instead of
fabricating a single library_id=1 row from BASE_PATH. Adds --library
<id|name> to restrict the walk and validates --path against the selected
library roots. The full library set is still passed to InsightGenerator so
resolve_full_path can probe every root when an insight resolves to a
different library than the one being walked.

Adds indicatif progress bars across the long-running utility binaries via
a shared src/bin_progress.rs helper (determinate bar + open-ended spinner
with consistent styling). Per-batch info! noise is replaced by the bar's
throughput/ETA; warnings and errors route through pb.println so they
scroll above the bar instead of fighting with it.

  populate_knowledge   spinner during scan, determinate bar over all libs
  backfill_hashes      spinner with running hashed/missing/errors counts
  import_calendar      determinate bar; embedding/store failures inline
  import_location_*    determinate bar advancing by chunk size
  import_search_*      determinate bar; pb cloned into the spawn task
  cleanup_files P1     determinate bar over DB paths
  cleanup_files P2     determinate bar; pb.suspend() around y/n/a/s prompt

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:55:33 -04:00

Image API

This is an Actix-web server for serving images and videos from a filesystem. Upon first run it will generate thumbnails for all images and videos at BASE_PATH.

Features

  • Automatic thumbnail generation for images and videos
  • EXIF data extraction and storage for photos
  • File watching with NFS support (polling-based)
  • Video streaming with HLS
  • Tag-based organization
  • Memories API for browsing photos by date
  • Video Wall - Auto-generated short preview clips for videos, served via a grid view
  • AI-Powered Photo Insights - Generate contextual insights from photos using LLMs
  • RAG-based Context Retrieval - Semantic search over daily conversation summaries
  • Automatic Daily Summaries - LLM-generated summaries of daily conversations with embeddings

External Dependencies

ffmpeg (required)

ffmpeg must be on PATH. It is used for:

  • HLS video streaming — transcoding/segmenting source videos into .m3u8 + .ts playlists
  • Video thumbnails — extracting a frame at the 3-second mark
  • Video preview clips — short looping previews for the Video Wall
  • HEIC / HEIF thumbnails — decoding Apple's HEIC format (your ffmpeg build must include libheif; most modern builds do)

Builds known to work in development include the gyan.dev full build on Windows and the distro ffmpeg packages on Linux. If HEIC thumbnails silently fail, run ffmpeg -formats | grep heif to confirm HEIF support.

RAW photo thumbnails (no extra dependency)

RAW formats (ARW, NEF, CR2, CR3, DNG, RAF, ORF, RW2, PEF, SRW, TIFF) are thumbnailed by reading the embedded JPEG preview from the TIFF IFD1 using kamadak-exif. No external RAW decoder (libraw / dcraw) is required. Files without an embedded preview fall back to ffmpeg (works for most NEF files), and anything that still can't be decoded is marked with a <thumb>.unsupported sentinel in the thumbnail directory so we don't retry it every scan. Delete those sentinels to force retries after a tooling upgrade.
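The sentinel check described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names sentinel_path and should_attempt are hypothetical, and the sketch assumes the sentinel name is formed by appending .unsupported to the full thumbnail file name.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Hypothetical helper: build the sentinel path by appending ".unsupported"
/// to the whole thumbnail path, so "thumbs/foo.arw" becomes
/// "thumbs/foo.arw.unsupported" (assumed naming, based on the README text).
fn sentinel_path(thumb_path: &Path) -> PathBuf {
    let mut name: OsString = thumb_path.as_os_str().to_owned();
    name.push(".unsupported");
    PathBuf::from(name)
}

/// Hypothetical helper: a scan only attempts a thumbnail when neither the
/// thumbnail itself nor its sentinel exists on disk.
fn should_attempt(thumb_path: &Path) -> bool {
    !thumb_path.exists() && !sentinel_path(thumb_path).exists()
}
```

Under this scheme, deleting a sentinel file is enough to make should_attempt return true again on the next scan, which matches the "delete to force retries" behavior described above.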

Environment

The API needs a handful of environment variables to run. Define them in a .env file located in the same directory as the binary or in one of its parent directories.

  • DATABASE_URL is a path or URL to a database (currently only SQLite is tested)
  • BASE_PATH is the root from which you want to serve images and videos
  • THUMBNAILS is a path where generated thumbnails should be stored. Thumbnails mirror the source tree under BASE_PATH and keep the source's original extension (e.g. foo.arw or bar.mp4), though the file contents are always JPEG bytes — browsers content-sniff. Files that can't be thumbnailed by the image crate, ffmpeg, or an embedded RAW preview get a zero-byte <thumb_path>.unsupported sentinel in this directory so subsequent scans skip them. Delete the *.unsupported files to force retries (for example after upgrading ffmpeg or adding libheif)
  • VIDEO_PATH is a path where HLS playlists and video parts should be stored
  • GIFS_DIRECTORY is a path where generated video GIF thumbnails should be stored
  • BIND_URL is the URL and port to bind to (typically your own IP address)
  • SECRET_KEY is the string used to sign tokens (ideally long and random)
  • RUST_LOG is one of off, error, warn, info, debug, trace, from least to most noisy [default: error]
  • EXCLUDED_DIRS is a comma-separated list of directories to exclude from the Memories API
  • PREVIEW_CLIPS_DIRECTORY (optional) is a path where generated video preview clips should be stored [default: preview_clips]
  • WATCH_QUICK_INTERVAL_SECONDS (optional) is the interval in seconds for quick file scans [default: 60]
  • WATCH_FULL_INTERVAL_SECONDS (optional) is the interval in seconds for full file scans [default: 3600]
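As an illustration of how the optional variables and their defaults fit together, here is a minimal sketch. The WatchConfig struct and the watch_config and seconds_or functions are hypothetical names, and falling back to the default when a value is not a valid integer is an assumption rather than documented behavior.

```rust
use std::time::Duration;

/// Hypothetical container for the optional settings listed above.
#[derive(Debug, PartialEq)]
struct WatchConfig {
    quick: Duration,
    full: Duration,
    preview_clips_dir: String,
}

/// Parse a seconds value; use `default` when the variable is missing or
/// not a valid unsigned integer (assumed behavior).
fn seconds_or(value: Option<String>, default: u64) -> Duration {
    let secs = value
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(default);
    Duration::from_secs(secs)
}

/// `lookup` abstracts the environment so the function can be exercised
/// without touching process-wide state. In real use, pass
/// `|key| std::env::var(key).ok()`.
fn watch_config(lookup: impl Fn(&str) -> Option<String>) -> WatchConfig {
    WatchConfig {
        quick: seconds_or(lookup("WATCH_QUICK_INTERVAL_SECONDS"), 60),
        full: seconds_or(lookup("WATCH_FULL_INTERVAL_SECONDS"), 3600),
        preview_clips_dir: lookup("PREVIEW_CLIPS_DIRECTORY")
            .unwrap_or_else(|| "preview_clips".to_string()),
    }
}
```

Passing the lookup as a closure keeps the defaults in one place and avoids mutating environment variables in tests.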

AI Insights Configuration (Optional)

The following environment variables configure AI-powered photo insights and daily conversation summaries:

Ollama Configuration

  • OLLAMA_PRIMARY_URL - Primary Ollama server URL [default: http://localhost:11434]
    • Example: http://desktop:11434 (your main/powerful server)
  • OLLAMA_FALLBACK_URL - Fallback Ollama server URL (optional)
    • Example: http://server:11434 (always-on backup server)
  • OLLAMA_PRIMARY_MODEL - Model to use on primary server [default: nemotron-3-nano:30b]
    • Example: nemotron-3-nano:30b, llama3.2:3b, etc.
  • OLLAMA_FALLBACK_MODEL - Model to use on fallback server (optional)
    • If not set, uses OLLAMA_PRIMARY_MODEL on fallback server

Legacy Variables (still supported):

  • OLLAMA_URL - Used if OLLAMA_PRIMARY_URL not set
  • OLLAMA_MODEL - Used if OLLAMA_PRIMARY_MODEL not set
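The precedence between the primary, legacy, and fallback variables could be resolved along these lines. This is a sketch under stated assumptions: OllamaConfig and ollama_config are hypothetical names, and the sketch assumes the fallback model is only meaningful when a fallback URL is configured.

```rust
/// Hypothetical resolved Ollama settings.
#[derive(Debug, PartialEq)]
struct OllamaConfig {
    primary_url: String,
    primary_model: String,
    fallback_url: Option<String>,
    fallback_model: Option<String>,
}

/// Resolve settings from a lookup function (for example
/// `|key| std::env::var(key).ok()`), preferring the new variable names,
/// then the legacy ones, then the documented defaults.
fn ollama_config(lookup: impl Fn(&str) -> Option<String>) -> OllamaConfig {
    let primary_url = lookup("OLLAMA_PRIMARY_URL")
        .or_else(|| lookup("OLLAMA_URL"))
        .unwrap_or_else(|| "http://localhost:11434".to_string());
    let primary_model = lookup("OLLAMA_PRIMARY_MODEL")
        .or_else(|| lookup("OLLAMA_MODEL"))
        .unwrap_or_else(|| "nemotron-3-nano:30b".to_string());
    let fallback_url = lookup("OLLAMA_FALLBACK_URL");
    // Assumption: without a fallback URL there is no fallback model either.
    // With one, an unset OLLAMA_FALLBACK_MODEL reuses the primary model.
    let fallback_model = fallback_url.as_ref().map(|_| {
        lookup("OLLAMA_FALLBACK_MODEL").unwrap_or_else(|| primary_model.clone())
    });
    OllamaConfig { primary_url, primary_model, fallback_url, fallback_model }
}
```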

OpenRouter Configuration (Hybrid Backend)

The hybrid agentic backend keeps embeddings + vision local (Ollama) while routing chat + tool-calling to OpenRouter. Enabled per-request when the client sends backend=hybrid.

  • OPENROUTER_API_KEY - OpenRouter API key. Required to enable the hybrid backend.
  • OPENROUTER_DEFAULT_MODEL - Model id used when the client doesn't specify one [default: anthropic/claude-sonnet-4]
    • Example: openai/gpt-4o-mini, google/gemini-2.5-flash
  • OPENROUTER_ALLOWED_MODELS - Comma-separated curated allowlist exposed to clients via GET /insights/openrouter/models. The mobile picker shows only these. Empty/unset = no picker, server default is used.
    • Example: openai/gpt-4o-mini,anthropic/claude-haiku-4-5,google/gemini-2.5-flash
  • OPENROUTER_BASE_URL - Override base URL [default: https://openrouter.ai/api/v1]
  • OPENROUTER_EMBEDDING_MODEL - Embedding model for OpenRouter [default: openai/text-embedding-3-small]. Only used if/when embeddings are routed through OpenRouter (currently embeddings stay local).
  • OPENROUTER_HTTP_REFERER - Optional HTTP-Referer for OpenRouter attribution
  • OPENROUTER_APP_TITLE - Optional X-Title for OpenRouter attribution

Capability checks are skipped for the curated allowlist — bad model ids surface as a 4xx from the chat call. Pick tool-capable models.
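For reference, parsing the comma-separated OPENROUTER_ALLOWED_MODELS value might look like the sketch below. The function name parse_allowed_models is hypothetical, and trimming whitespace and dropping empty entries are assumptions about how lenient the parser is. No capability check is performed, in line with the note above.

```rust
/// Hypothetical parser for OPENROUTER_ALLOWED_MODELS.
/// An unset or empty value yields an empty list, which the README
/// describes as "no picker, server default is used".
fn parse_allowed_models(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(',')
        .map(str::trim) // assumed: surrounding whitespace is ignored
        .filter(|id| !id.is_empty()) // assumed: stray commas are ignored
        .map(str::to_owned)
        .collect()
}
```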

SMS API Configuration

  • SMS_API_URL - URL to SMS message API [default: http://localhost:8000]
    • Used to fetch conversation data for context in insights
  • SMS_API_TOKEN - Authentication token for SMS API (optional)

Agentic Insight Generation

  • AGENTIC_MAX_ITERATIONS - Maximum tool-call iterations per agentic insight request [default: 10]
    • Controls how many times the model can invoke tools before being forced to produce a final answer
    • Increase for more thorough context gathering; decrease to limit response time

Insight Chat Continuation

After an agentic insight is generated, the conversation can be continued. Endpoints:

  • POST /insights/chat — single-turn reply (non-streaming)
  • POST /insights/chat/stream — SSE variant with live text deltas and tool_call / tool_result events. Mobile client uses this.
  • GET /insights/chat/history?path=...&library=... — rendered transcript; each assistant message carries a tools: [{name, arguments, result}] array
  • POST /insights/chat/rewind — truncate transcript at a rendered index (drops that message + any preceding tool scaffolding + later turns). Used for "try again from here" flows. The initial user message is protected.

Amend mode (amend: true in the chat request body) regenerates the insight's title and inserts a new row instead of appending to the existing transcript, so you can rewrite the saved summary from within chat.

  • AGENTIC_CHAT_MAX_ITERATIONS - Cap on tool-calling iterations per chat turn [default: 6]
    • Per-request max_iterations (when sent by the client) is clamped to this cap
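The clamping rule can be expressed in a few lines. In this sketch the function name effective_max_iterations is hypothetical, and using the cap itself when the client sends no value is an assumption.

```rust
/// Hypothetical helper: combine the per-request `max_iterations` with the
/// server-side cap from AGENTIC_CHAT_MAX_ITERATIONS.
/// A request above the cap is reduced to the cap; a missing request value
/// falls back to the cap (assumed behavior).
fn effective_max_iterations(requested: Option<u32>, cap: u32) -> u32 {
    requested.map_or(cap, |n| n.min(cap))
}
```

With the default cap of 6, a client asking for 20 iterations gets 6, and a client asking for 3 gets 3.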

Fallback Behavior

  • The primary server is tried first, with a 5-second connection timeout
  • On failure, the request automatically falls back to the fallback server (if configured)
  • Total request timeout is 120 seconds to accommodate LLM inference
  • Logs indicate which server/model was used and any failover attempts
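The primary-then-fallback flow can be sketched generically as below. This is an illustration only: Backend and with_fallback are hypothetical names, the call closure stands in for the real HTTP request, and the 5-second connection timeout and 120-second total timeout are assumed to be configured on the HTTP client rather than in this function.

```rust
use std::fmt::Display;

/// Hypothetical description of one configured server.
#[derive(Debug, Clone, PartialEq)]
struct Backend {
    url: String,
    model: String,
}

/// Try `primary` first. If it fails and a fallback is configured, log the
/// failover and try the fallback. Returns the result together with the
/// backend that actually served it, so callers can log which one was used.
fn with_fallback<T, E: Display>(
    primary: &Backend,
    fallback: Option<&Backend>,
    mut call: impl FnMut(&Backend) -> Result<T, E>,
) -> Result<(T, Backend), E> {
    match call(primary) {
        Ok(value) => Ok((value, primary.clone())),
        Err(err) => match fallback {
            Some(fb) => {
                eprintln!(
                    "primary {} failed ({}); trying fallback {}",
                    primary.url, err, fb.url
                );
                call(fb).map(|value| (value, fb.clone()))
            }
            // No fallback configured: surface the primary error.
            None => Err(err),
        },
    }
}
```

Returning the serving backend alongside the value keeps the "which server/model was used" logging in the caller, where the request context is available.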

Daily Summary Generation

Daily conversation summaries are generated automatically on server startup. The following are currently configured in src/main.rs:

  • Date range for summary generation
  • Contacts to process
  • Model version used for embeddings: nomic-embed-text:v1.5
Description
A Rust actix based Image and Video Server.