Adds a nullable, comma-separated TEXT column, excluded_dirs, to the
libraries table. Effective excludes for a walk = (env-var globals) ∪
(library.excluded_dirs). Empty / NULL = no library-specific extras;
the global env var still applies.

Migration (2026-05-01-110000_libraries_excluded_dirs)

ALTER TABLE libraries ADD COLUMN excluded_dirs TEXT. NULL on every
existing row, so there is no behavior change on upgrade.

Library struct + helpers (libraries.rs)

- Library gains excluded_dirs: Vec<String>, parsed from the column
  by parse_excluded_dirs_column (drops empty / whitespace-only
  entries, matching the env-var parser).
- Library::effective_excluded_dirs(globals) returns the union.
- From<LibraryRow> hydrates the field on AppState construction so
  /libraries surfaces it.

Watcher / walkers / memories

Every per-library walker now consults the effective set:
- process_new_files (file-watch ingest, RAW/EXIF/face)
- process_face_backlog (inherits via filter_excluded)
- create_thumbnails (startup + new-file branch)
- update_media_counts (Prometheus gauge)
- cleanup_orphaned_playlists (per-library source-existence check)
- memories endpoint (PathExcluder)

The effective set is computed once per library iteration in the
watcher tick and threaded through; the called functions retain their
flat &[String] signature (no per-library awareness needed inside
the walker primitives).

Use case: mount a parent directory while a sibling library covers
a child subtree, and exclude the child subtree from the parent so
the libraries don't double-walk / double-write image_exif. With
hash-keyed derived data (Branches B/C), the duplicated walk and
image_exif writes are the only cost this prevents; face / tag /
insight sharing was already correct via content_hash.

Tests: 228 pass (226 from previous + 2 new in libraries::tests:
parse_excluded_dirs_column edge cases,
effective_excluded_dirs_unions_global_and_per_library).

CLAUDE.md gains a "Per-library excludes" subsection of the
multi-library data model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
700 lines
38 KiB
Markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

An Actix-web REST API for serving images and videos from a filesystem with automatic thumbnail generation, EXIF extraction, tag organization, and a memories feature for browsing photos by date. Uses SQLite/Diesel ORM for data persistence and ffmpeg for video processing.

## Development Commands

### Building & Running
```bash
# Build for development
cargo build

# Build for release (uses thin LTO optimization)
cargo build --release

# Run the server (requires .env file with DATABASE_URL, BASE_PATH, THUMBNAILS, VIDEO_PATH, BIND_URL, SECRET_KEY)
cargo run

# Run with specific log level
RUST_LOG=debug cargo run
```

### Testing
```bash
# Run all tests (requires BASE_PATH in .env)
cargo test

# Run specific test
cargo test test_name

# Run tests with output
cargo test -- --nocapture
```

### Database Migrations
```bash
# Install diesel CLI (one-time setup)
cargo install diesel_cli --no-default-features --features sqlite

# Create new migration
diesel migration generate migration_name

# Run migrations (also runs automatically on app startup)
diesel migration run

# Revert last migration
diesel migration revert

# Regenerate schema.rs after manual migration changes
diesel print-schema > src/database/schema.rs
```

### Code Quality
```bash
# Format code
cargo fmt

# Run clippy linter
cargo clippy

# Fix automatically fixable issues
cargo fix
```

### Utility Binaries
```bash
# Two-phase cleanup: resolve missing files and validate file types
cargo run --bin cleanup_files -- --base-path /path/to/media --database-url ./database.db
```

## Architecture Overview

### Core Components

**Layered Architecture:**
- **HTTP Layer** (`main.rs`): Route handlers for images, videos, metadata, tags, favorites, memories
- **Auth Layer** (`auth.rs`): JWT token validation, Claims extraction via FromRequest trait
- **Service Layer** (`files.rs`, `exif.rs`, `memories.rs`): Business logic for file operations and EXIF extraction
- **DAO Layer** (`database/mod.rs`): Trait-based data access (ExifDao, UserDao, FavoriteDao, TagDao)
- **Database Layer**: Diesel ORM with SQLite, schema in `database/schema.rs`

**Async Actor System (Actix):**
- `StreamActor`: Manages ffmpeg video processing lifecycle
- `VideoPlaylistManager`: Scans directories and queues videos
- `PlaylistGenerator`: Creates HLS playlists for video streaming

### Database Schema & Patterns

**Tables:**
- `users`: Authentication (id, username, password_hash)
- `favorites`: User-specific favorites (userid, path)
- `tags`: Custom labels with timestamps
- `tagged_photo`: Many-to-many photo-tag relationships
- `image_exif`: Rich metadata (file_path + 16 EXIF fields: camera, GPS, dates, exposure settings)

**DAO Pattern:**
All database access goes through trait-based DAOs (e.g., `ExifDao`, `SqliteExifDao`). Connection pooling uses `Arc<Mutex<SqliteConnection>>`. All DB operations are traced with OpenTelemetry in release builds.

**Key DAO Methods:**
- `store_exif()`, `get_exif()`, `get_exif_batch()`: EXIF CRUD operations
- `query_by_exif()`: Complex filtering by camera, GPS bounds, date ranges
- Batch operations minimize DB hits during file watching

### Multi-library data model

ImageApi supports more than one library (a library = a `(name, root_path)`
row in the `libraries` table that maps to a mounted directory tree). The
same bytes may exist under more than one library (the typical case is an
"active" library plus an "archive" library that ingests files as they age
out), and the data model is designed so that derived data follows the
**bytes**, not the path, and user-managed data does the same.

**The principle.** A photo's identity is its `content_hash` (blake3, see
`src/content_hash.rs`). Anything we compute from or attach to a photo is
keyed on that hash so it survives:
- the same file appearing in a second library (backup / archive / mirror),
- the file moving between libraries (recent → archive handoff),
- the file moving within a library (re-organized rel_path),
- intra-library duplicates (same bytes at two paths).

**Table classification.** Three categories drive the keying decision:

| Category | Key | Rationale | Tables |
|---|---|---|---|
| Intrinsic to bytes | `content_hash` | Rerunning is wasted work (or LLM cost) | `face_detections` ✓, `image_exif` (target), `photo_insights` (target), `video_preview_clips` (target) |
| User intent about a photo | `content_hash` | "Tag this photo" means the bytes, not a path | `tagged_photo` (target), `favorites` (target) |
| Library administrative | `(library_id, rel_path)` | Tied to a specific filesystem location | `libraries`, `entity_photo_links`, the `rel_path` back-ref columns on hash-keyed tables |

✓ = already implemented this way. *(target)* = today still keyed on
`(library_id, rel_path)` and slated for migration. The migration adds a
nullable `content_hash` column, populates it from `image_exif` where
known, and read paths fall back to rel_path while the hash is null.

**Carrying a `rel_path` even when hash-keyed.** Hash-keyed tables retain
`(library_id, rel_path)` columns as a denormalized **back-reference**, not
as the key. This lets a single query answer "what is at this path right
now" without joining through `image_exif`, and supports the path-only
endpoints that predate the hash. `face_detections` is the reference
implementation: hash is the truth, path is a hint.

**Merge semantics on read.** When the same hash has rows under more than
one library:
- Set-valued data (tags, favorites, faces, entity links) → **union**.
- Scalar data (current insight, EXIF row, video preview clip) → earliest
  `generated_at` / `created_time` wins. The historical lib1 row beats a
  re-generated lib2 row, so the user's curated insight isn't shadowed by
  a re-run on archive ingest.
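
The two merge rules above can be sketched in a few lines. This is a minimal illustration, not the repository's code: `ScalarRow`, `merge_set_valued`, and `merge_scalar` are hypothetical names, and the `i64` Unix-seconds type for `generated_at` is an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical scalar row (for example an insight) stored under one library.
#[derive(Debug, Clone, PartialEq)]
struct ScalarRow {
    library_id: i32,
    generated_at: i64, // assumed: Unix seconds
    body: String,
}

/// Set-valued data: union across every library that holds the hash.
fn merge_set_valued(per_library: &[Vec<String>]) -> BTreeSet<String> {
    per_library.iter().flatten().cloned().collect()
}

/// Scalar data: the row with the earliest `generated_at` wins.
fn merge_scalar(rows: &[ScalarRow]) -> Option<&ScalarRow> {
    rows.iter().min_by_key(|r| r.generated_at)
}
```

With a lib1 row generated at `100` and a lib2 re-run generated at `200`, `merge_scalar` returns the lib1 row, which matches the "historical row wins" rule.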

**Write attribution.** A new tag/favorite/insight created while viewing
under lib2 binds to the bytes, not to lib2, so it shows up under lib1
too. This is by design, but it's the most surprising rule on first
encounter; clients should not assume tags are library-scoped.

**Hash-less rows (transitional state).** During and immediately after a
new mount, `image_exif.content_hash` is being populated by
`backfill_unhashed_backlog` (capped per tick). Rules during this window:
- Writes: if the hash is known, write hash-keyed. If not, write
  `(library_id, rel_path)`-keyed and let the reconciliation job collapse
  duplicates once the hash lands.
- Reads: prefer hash key, fall back to `(library_id, rel_path)`.
- Reconciliation: a one-shot pass after every backfill tick collapses
  rows that now share a hash, applying the merge semantics above.
  Idempotent, so safe to re-run.
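
As a rough sketch of the read and write rules for the transitional window, the snippet below models a derived-data table with two in-memory maps. `DerivedStore` and its methods are hypothetical stand-ins for illustration; the real tables live in SQLite and the reconciliation pass is not modelled.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a table that is hash-keyed when the
/// hash is known and `(library_id, rel_path)`-keyed otherwise.
#[derive(Default)]
struct DerivedStore {
    by_hash: HashMap<String, String>,
    by_path: HashMap<(i32, String), String>,
}

impl DerivedStore {
    /// Write: hash-keyed when the hash is known, path-keyed otherwise.
    fn write(&mut self, hash: Option<&str>, library_id: i32, rel_path: &str, value: &str) {
        match hash {
            Some(h) => {
                self.by_hash.insert(h.to_string(), value.to_string());
            }
            None => {
                self.by_path
                    .insert((library_id, rel_path.to_string()), value.to_string());
            }
        }
    }

    /// Read: prefer the hash key, fall back to `(library_id, rel_path)`.
    fn read(&self, hash: Option<&str>, library_id: i32, rel_path: &str) -> Option<&String> {
        hash.and_then(|h| self.by_hash.get(h))
            .or_else(|| self.by_path.get(&(library_id, rel_path.to_string())))
    }
}
```

A value written before the hash landed stays reachable afterwards through the path fallback, which is the behavior the reconciliation pass later makes redundant.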

**Library handoff (recent → archive).** When a file moves between
libraries (e.g. operator moves `~/photos/2024/IMG.nef` to the archive
mount), the file watcher sees the disappearance under lib1 and the
appearance under lib2. Hash-keyed rows don't need migration; the
`(library_id, rel_path)` back-ref columns are updated to point to the new
location. Library administrative rows (`entity_photo_links`,
`(library_id, rel_path)` rows in `image_exif` for hash-less items) are
re-keyed by the move detector, which matches a disappearance to an
appearance by `content_hash` within a configurable window.

**Orphans (source deleted while a copy survives).** When the only
`image_exif` row for a hash is deleted (file removed from disk), the
hash-keyed derived rows survive **as long as another `image_exif` row
references the same hash**. If the last reference is gone, derived rows
are eligible for GC (deferred: the GC job runs on a slow schedule so
that a brief unmount or rename doesn't wipe history).

**Stats and counts.** When reporting "how many photos do you have," count
`DISTINCT content_hash` over `image_exif`, not row count. Face stats
already do this (`FaceDao::stats` in `src/faces.rs`); other counters
should follow suit. Numerator and denominator must live in the same
domain; see the face-stats commentary below for the cautionary tale.
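
To make the counting rule concrete, here is a small sketch over an in-memory slice. The `ExifRow` tuple shape and `distinct_photo_count` are assumptions for illustration; the actual counters run as SQL against `image_exif`.

```rust
use std::collections::HashSet;

/// Hypothetical slice of `image_exif`: (library_id, rel_path, content_hash).
type ExifRow<'a> = (i32, &'a str, Option<&'a str>);

/// Count photos as DISTINCT content_hash, skipping rows not hashed yet.
fn distinct_photo_count(rows: &[ExifRow]) -> usize {
    rows.iter()
        .filter_map(|(_, _, hash)| *hash)
        .collect::<HashSet<&str>>()
        .len()
}
```

Four rows where two libraries hold the same bytes and one row is still unhashed count as two photos, not four.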

**Per-library scoping when the user asks for it.** A request scoped to
`?library=N` filters the `image_exif` view to that library, and the
hash-keyed derived data is joined through that view. The user sees only
photos that have a copy under lib N, but the derived data attached to
those photos is the merged hash-keyed view. This is the answer to "show
me archive photos with their original tags."

**Operator kill switch (`libraries.enabled`).** Setting `enabled=0` on a
library is a hard pause: the watcher skips it entirely (before the
probe, before ingest, before any maintenance pass), and the orphan-GC
all-online consensus check filters disabled libraries out (they don't
keep the GC window closed). Reads / serving are unaffected; nothing
prevents `/image?path=...` from resolving against a disabled library's
root if the file is on disk. The existing `image_exif` rows for a
disabled library are **not deleted**; they continue to anchor
hash-keyed derived data, so cross-library duplicates survive the
disable. Toggle via SQL; there is intentionally no HTTP endpoint for
library mutation (single-user tool, no role / permission story).
Typical workflows: stage a new mount with `enabled=0` then flip to `1`;
quiet a flaky NAS during maintenance without disturbing the rest of
the system.

**Per-library excludes (`libraries.excluded_dirs`).** A
comma-separated column, same shape as the global `EXCLUDED_DIRS` env
var, that's applied **in union** with the env-var globals when a
walker scans this library. Use case: mount a parent directory as a
new library while a sibling library covers a child subtree, and
exclude that child subtree from the parent so the two libraries
don't double-walk and double-write `image_exif`. Hash-keyed derived
data (faces, tags, insights) is unaffected either way (those
follow the bytes), but `image_exif` row count, walker CPU, and
thumbnail disk usage all drop to 1× instead of 2× for the overlap.
Affects: file-watch ingest (`process_new_files`), thumbnail
generation, media-count gauges, the orphaned-playlist cleanup walk,
and the `/memories` endpoint. The face-detection backlog drain
inherits via `face_watch::filter_excluded`. NULL = no extras (only
the global env var applies).
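
The parse-then-union behavior described above could look roughly like the sketch below. The function names mirror `parse_excluded_dirs_column` and `effective_excluded_dirs` from `libraries.rs`, but the bodies and signatures here are assumptions written for illustration (the real `effective_excluded_dirs` is a method on `Library`, shown here as a free function).

```rust
/// Split a comma-separated value, trimming entries and dropping empties.
/// `None` models a NULL column.
fn parse_excluded_dirs_column(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
        .collect()
}

/// Union of the global excludes and the library's own: globals first,
/// per-library extras appended, duplicates dropped.
fn effective_excluded_dirs(globals: &[String], per_library: &[String]) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for dir in globals.iter().chain(per_library) {
        if !out.contains(dir) {
            out.push(dir.clone());
        }
    }
    out
}
```

Because the result is a flat list of strings, the walker primitives can keep taking `&[String]` and never need to know which entries came from the library row.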

**Library availability and safety.** Libraries can be on network shares
or removable media; the file watcher must not interpret a temporary
unavailability as a mass-deletion event. Every tick begins with a
**presence probe** per library: the library is considered online iff
its `root_path` exists, is readable, and a top-level scan returns at
least one expected entry (or matches a recent file-count high-water
mark within a tolerance). The probe result gates which actions are safe
to run on that library this tick:

| Action | Requires online? |
|---|---|
| Quick / full scan ingest of new files | yes |
| EXIF / face / insight backlog drains | yes, but the work runs against any online library |
| Move-handoff detection (lib1 disappearance ↔ lib2 appearance match) | **both** libraries online |
| `(library_id, rel_path)` re-keying on detected move | **both** libraries online |
| Orphan GC of hash-keyed derived data | all libraries that have *ever* held the hash must be online and confirmed-clean for two consecutive ticks |
| Reads / serving | always allowed; falls back to whichever library is online |

A library that fails the probe enters a "stale" state: writes scoped to
it are paused, its rows are flagged stale (not deleted) in
`/libraries` status, and the watcher logs at `warn` once per
state-transition (not per tick). A library that recovers re-enters the
online set automatically; no operator action required for transient
outages. The intent is that pulling a USB drive, rebooting a NAS, or
losing a VPN never triggers a destructive code path; the worst case is
that derived-data work pauses until the share returns.
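
A simplified version of the presence probe might look like this. It is a sketch under assumptions: `Presence` and `probe_library` are hypothetical names, "at least one expected entry" is reduced to "at least one readable top-level entry", and the file-count high-water-mark tolerance is not modelled.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Presence {
    Online,
    Stale,
}

/// Simplified probe: the root must exist, be readable, and contain at least
/// one readable top-level entry. Anything else is treated as stale.
fn probe_library(root: &Path) -> Presence {
    match fs::read_dir(root) {
        Ok(mut entries) => {
            if entries.any(|entry| entry.is_ok()) {
                Presence::Online
            } else {
                Presence::Stale
            }
        }
        // Missing mount, permission error, network error: never "deleted".
        Err(_) => Presence::Stale,
    }
}
```

An empty mount point (a common symptom of an unmounted share) reports `Stale`, so none of the destructive passes run against it.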

The same rule constrains the move-handoff matcher: a disappearance
under lib1 only counts as a "move" if there is a matching appearance
under another **online** library within the window. A bare
disappearance with no matching appearance is treated as
"unavailable-or-deleted, defer judgment"; it does not re-key any rows
and does not enqueue GC.
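
The matcher rule can be sketched as a pure function. Everything here is hypothetical scaffolding (`Event`, `Verdict`, `classify_disappearance`, timestamps as `u64` seconds); it only illustrates the "same hash, other library, both online, inside the window" condition.

```rust
/// Hypothetical watcher event: a hash seen vanishing from, or appearing in,
/// a library at a given time (seconds).
struct Event<'a> {
    library_id: i32,
    content_hash: &'a str,
    at: u64,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    MovedTo(i32),
    DeferJudgment,
}

/// A disappearance counts as a move only when another online library shows
/// the same hash within the window and the source library is online too.
fn classify_disappearance(
    gone: &Event,
    appearances: &[Event],
    online: &[i32],
    window_secs: u64,
) -> Verdict {
    if !online.contains(&gone.library_id) {
        return Verdict::DeferJudgment;
    }
    appearances
        .iter()
        .find(|a| {
            a.content_hash == gone.content_hash
                && a.library_id != gone.library_id
                && online.contains(&a.library_id)
                && a.at.abs_diff(gone.at) <= window_secs
        })
        .map(|a| Verdict::MovedTo(a.library_id))
        .unwrap_or(Verdict::DeferJudgment)
}
```

`DeferJudgment` deliberately carries no side effects: the caller neither re-keys rows nor enqueues GC for it.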

**Maintenance pipeline (`src/library_maintenance.rs`).** The watcher
runs three maintenance passes per tick that together implement the
move/handoff and orphan rules:

1. **Missing-file scan**: per online library, paginated. A page of
   `image_exif` rows is loaded (`IMAGE_EXIF_MISSING_SCAN_PAGE_SIZE`,
   default 500), each row's `(root_path/rel_path)` is `stat()`-ed,
   and confirmed-not-found rows are deleted from `image_exif`
   (capped at `IMAGE_EXIF_MISSING_DELETE_CAP_PER_TICK`, default 200).
   Permission/IO errors are skipped, never deleted; only `NotFound`
   triggers a deletion. The cursor wraps every time a partial page
   comes back, so the whole library is swept across consecutive ticks.
   Skipped wholesale for Stale libraries via the per-library probe
   gate at the top of the loop iteration.

2. **Back-ref refresh**: DB-only. For `face_detections`,
   `tagged_photo`, and `photo_insights`: any hash-keyed row whose
   `(library_id, rel_path)` no longer matches an `image_exif` row
   *but whose `content_hash` does* is repointed at the surviving
   `image_exif` location. Idempotent SQL; no health gate needed.
   This is what makes the recent → archive handoff invisible to
   read paths: when the missing-file scan retires the lib-A row,
   tags/faces/insights pivot to lib-B's path before any user
   notices.

3. **Orphan GC**: destructive. Hash-keyed derived rows whose
   `content_hash` no longer has any `image_exif` row are eligible.
   Two-tick consensus: a hash must be observed orphaned on two
   consecutive ticks AND every library must be online for both. A
   single Stale tick within the window cancels all pending deletes.
   The pending set is held in memory (`OrphanGcState`); a restart
   resets it, which only delays a delete, never causes one. Tags,
   faces, and insights for orphaned hashes are deleted in one batch
   per tick.
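
For pass 1, the key safety property is that only a `NotFound` error leads to a delete. The sketch below shows that classification with the standard library; `ScanOutcome`, `check_row`, and `rows_to_delete` are hypothetical names, and paging and the cursor are left out.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ScanOutcome {
    Present,
    ConfirmedMissing,
    Skipped,
}

/// stat() one row: only `NotFound` confirms a missing file; permission and
/// other IO errors are skipped and never lead to a delete.
fn check_row(root: &Path, rel_path: &str) -> ScanOutcome {
    match fs::metadata(root.join(rel_path)) {
        Ok(_) => ScanOutcome::Present,
        Err(e) if e.kind() == ErrorKind::NotFound => ScanOutcome::ConfirmedMissing,
        Err(_) => ScanOutcome::Skipped,
    }
}

/// rel_paths from one page that may be deleted this tick, honoring the cap.
fn rows_to_delete<'a>(root: &Path, page: &[&'a str], delete_cap: usize) -> Vec<&'a str> {
    page.iter()
        .copied()
        .filter(|rel| check_row(root, rel) == ScanOutcome::ConfirmedMissing)
        .take(delete_cap)
        .collect()
}
```

Note that a missing root would make every row look `NotFound`, which is why this pass only makes sense behind the per-library probe gate described in the text.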

A backup library that briefly disappears, then returns within two
ticks, never loses any derived data. A move from lib-A to lib-B
without disappearance flips through pass 1 (lib-A row retired) and
pass 2 (back-refs follow), with pass 3 noting nothing because the
hash is still present in `image_exif` (lib-B's row).
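
The two-tick consensus in pass 3 can be modelled as a tiny state machine. This is an illustrative sketch, not the contents of `OrphanGcState`: the `OrphanGc` struct and its `tick` method are assumptions, and hashes are plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the in-memory pending set described in the text.
#[derive(Default)]
struct OrphanGc {
    pending: HashSet<String>,
}

impl OrphanGc {
    /// Feed one tick's observation; returns the hashes confirmed for deletion.
    fn tick(&mut self, orphaned_now: &HashSet<String>, all_online: bool) -> Vec<String> {
        if !all_online {
            // A single stale tick cancels every pending delete.
            self.pending.clear();
            return Vec::new();
        }
        // Confirmed = orphaned on the previous tick and still orphaned now.
        let mut confirmed: Vec<String> =
            self.pending.intersection(orphaned_now).cloned().collect();
        confirmed.sort();
        // Hashes seen orphaned for the first time wait for the next tick.
        self.pending = orphaned_now
            .iter()
            .filter(|h| !confirmed.contains(*h))
            .cloned()
            .collect();
        confirmed
    }
}
```

Starting from `OrphanGc::default()` after a restart simply means the first observation is treated as new, so a restart can delay a delete but not cause one.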

**Known gap: in-place content changes (future Branch D).** The
maintenance pipeline assumes a `(library_id, rel_path)`'s bytes are
stable for as long as the file exists at that path. If a user edits
a file in place (crop, re-export) without renaming, the watcher's
quick scan walks the file (mtime is recent) but `process_new_files`
short-circuits because `(library_id, rel_path)` already has an
`image_exif` row: no re-hash, no re-EXIF, no face redetection. The
row's `content_hash` keeps pointing at the original bytes. Tags /
faces / insights stay attached to the original hash and continue to
display because the rel_path back-ref still resolves; new faces
introduced by the edit are never detected.

The right place to fix this is a **stale-content detection pass**
that compares `image_exif.last_modified` / `size_bytes` to
`fs::metadata` for rows the quick scan would otherwise skip. On
mismatch, recompute the hash, update `image_exif`, and apply the
"content branched" semantics:
- **Faces** re-run (faces are fully derived from bytes).
- **Tags** migrate to the new hash (user intent: "this photo is
  vacation" survives a crop).
- **Insights** migrate forward as a starting point and are flagged
  for re-generation.
- **Favorites** (when migrated to hash-keyed) follow the path /
  user intent.

The interesting case is the operator who keeps an unedited copy in
the archive library and edits the local copy: post-detection, the
archive copy stays on the original hash, the local copy branches to
the new hash, and the two histories cleanly split. Apollo's
`derived.db` cache will need an invalidation hook for the changed
hash; design it alongside Branch D.
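
The comparison at the heart of the proposed stale-content detection pass could be as small as the sketch below. Since Branch D is not implemented, everything here is an assumption: `StoredMeta`, the column types (`i64` Unix seconds, `u64` bytes), and `content_may_have_changed` are illustrative only.

```rust
use std::fs;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// What `image_exif` is assumed to remember about a file. Column names come
/// from the text above; the types are guesses.
struct StoredMeta {
    last_modified: i64, // assumed: Unix seconds
    size_bytes: u64,
}

/// True when the file on disk no longer matches the stored size or mtime,
/// meaning the hash should be recomputed.
fn content_may_have_changed(path: &Path, stored: &StoredMeta) -> std::io::Result<bool> {
    let meta = fs::metadata(path)?;
    let mtime = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs() as i64)
        .unwrap_or(0);
    Ok(meta.len() != stored.size_bytes || mtime != stored.last_modified)
}
```

A mismatch is only a hint that the bytes changed; the actual decision to branch the content would still rest on recomputing the hash.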

### File Processing Pipeline

**Thumbnail Generation:**
1. Startup scan: Rayon parallel walk of BASE_PATH
2. Creates 200x200 thumbnails in THUMBNAILS directory (mirrors source structure)
3. Videos: extracts frame at 3-second mark via ffmpeg
4. Images: uses `image` crate for JPEG/PNG processing
5. RAW formats (NEF/CR2/ARW/DNG/etc.): the `image` crate can't decode RAW
   pixel data, so the pipeline pulls an embedded JPEG preview instead. Fast
   path is `exif::read_jpeg_at_ifd` against IFD0 (PRIMARY) and IFD1
   (THUMBNAIL), which covers most older bodies and DNGs. Slow-path fallback shells
   out to **`exiftool`** for `PreviewImage` / `JpgFromRaw` / `OtherImage`,
   which reaches MakerNote / SubIFD-hosted previews kamadak-exif can't see
   (e.g. Nikon's `PreviewIFD`, where modern Nikon bodies store the full-res
   review JPEG). All candidates are pooled and the largest valid JPEG wins.
   See `src/exif.rs::extract_embedded_jpeg_preview`.

**File Watching:**
Runs in background thread with two-tier strategy:
- **Quick scan** (default 60s): Recently modified files only
- **Full scan** (default 3600s): Comprehensive directory check
- Batch queries EXIF DB to detect new files
- Configurable via `WATCH_QUICK_INTERVAL_SECONDS` and `WATCH_FULL_INTERVAL_SECONDS`

**EXIF Extraction:**
- Uses `kamadak-exif` crate
- Supports: JPEG, TIFF, RAW (NEF, CR2, CR3), HEIF/HEIC, PNG, WebP
- Extracts: camera make/model, lens, dimensions, GPS coordinates, focal length, aperture, shutter speed, ISO, date taken
- Triggered on upload and during file watching

**File Upload Behavior:**
If file exists, appends timestamp to filename (`photo_1735124234.jpg`) to preserve history without overwrites.
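
The renaming rule for uploads that collide with an existing file can be illustrated as follows. `timestamped_name` is a hypothetical helper; the `_<unix_seconds>` suffix before the extension is inferred from the `photo_1735124234.jpg` example, and the `"upload"` fallback stem is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Append `_<unix_seconds>` to the file stem, keeping the directory and the
/// extension (if any) unchanged.
fn timestamped_name(original: &Path, unix_seconds: u64) -> PathBuf {
    let stem = original
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("upload"); // assumed fallback for non-UTF-8 or empty stems
    let name = match original.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{}_{}.{}", stem, unix_seconds, ext),
        None => format!("{}_{}", stem, unix_seconds),
    };
    original.with_file_name(name)
}
```

A caller would apply this only after checking that the target path already exists, passing the current Unix time as `unix_seconds`.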

### Authentication Flow

**Login:**
1. POST `/login` with username/password
2. Verify with `bcrypt::verify()` against password_hash
3. Generate JWT with claims: `{ sub: user_id, exp: 5_days_from_now }`
4. Sign with HS256 using `SECRET_KEY` environment variable

**Authorization:**
All protected endpoints extract `Claims` via `FromRequest` trait implementation. Token passed as `Authorization: Bearer <token>` header.

### API Structure

**Key Endpoint Patterns:**

```text
// Image serving & upload
GET /image?path=...&size=...&format=...
POST /image (multipart file upload)

// Metadata & EXIF
GET /image/metadata?path=...

// Advanced search with filters
GET /photos?path=...&recursive=true&sort=DateTakenDesc&camera_make=Canon&gps_lat=...&gps_lon=...&gps_radius_km=10&date_from=...&date_to=...&tag_ids=1,2,3&media_type=Photo

// Video streaming (HLS)
POST /video/generate (creates .m3u8 playlist + .ts segments)
GET /video/stream?path=... (serves playlist)

// Tags
GET /image/tags/all
POST /image/tags (add tag to file)
DELETE /image/tags (remove tag from file)
POST /image/tags/batch (bulk tag updates)

// Memories (week-based grouping)
GET /memories?path=...&recursive=true

// AI Insights
POST /insights/generate (non-agentic single-shot)
POST /insights/generate/agentic (tool-calling loop; body: { file_path, backend?, model?, ... })
GET /insights?path=...&library=...
GET /insights/models (local Ollama models + capabilities)
GET /insights/openrouter/models (curated OpenRouter allowlist)
POST /insights/rate (thumbs up/down for training data)

// Insight Chat Continuation
POST /insights/chat (single-turn reply, non-streaming)
POST /insights/chat/stream (SSE: text / tool_call / tool_result / truncated / done)
GET /insights/chat/history?path=... (rendered transcript with tool invocations)
POST /insights/chat/rewind (truncate transcript at a rendered index)
```

**Request Types:**
- `FilesRequest`: Supports complex filtering (tags, EXIF fields, GPS radius, date ranges)
- `SortType`: Shuffle, NameAsc/Desc, TagCountAsc/Desc, DateTakenAsc/Desc

### Important Patterns

**Service Builder Pattern:**
Routes are registered via composable `ServiceBuilder` trait in `service.rs`. Allows modular feature addition.

**Path Validation:**
Always use `is_valid_full_path(&base_path, &requested_path, check_exists)` to prevent directory traversal attacks.

**File Type Detection:**
Centralized in `file_types.rs` with constants `IMAGE_EXTENSIONS` and `VIDEO_EXTENSIONS`. Provides both `Path` and `DirEntry` variants for performance.

**OpenTelemetry Tracing:**
All database operations and HTTP handlers wrapped in spans. In release builds, exports to OTLP endpoint via `OTLP_OTLS_ENDPOINT`. Debug builds use basic logger.

**Memory Exclusion:**
`PathExcluder` in `memories.rs` filters out directories from memories API via `EXCLUDED_DIRS` environment variable (comma-separated paths or substring patterns). The same excluder is applied to face-detection candidates (`face_watch::filter_excluded`) so junk directories like `@eaDir` / `.thumbnails` don't burn detect calls on Apollo.

### Face detection system

ImageApi owns the face data; Apollo (sibling repo) hosts the insightface inference service. Inference is triggered automatically by the file watcher and persisted into two tables:

- `persons(id, name UNIQUE COLLATE NOCASE, cover_face_id, entity_id, created_from_tag, notes, ...)`: operator-managed; name is the user-visible identity.
- `face_detections(id, library_id, content_hash, rel_path, bbox_*, embedding BLOB, confidence, source, person_id, status, model_version, ...)`: keyed on `content_hash` so a photo duplicated across libraries is detected once. Marker rows for `status IN ('no_faces','failed')` carry NULL bbox/embedding (CHECK constraint enforces this).

**Why content_hash and not (library_id, rel_path):** ties face data to the bytes, not the path. A backup mount that copies files from the primary library naturally inherits the existing detections without re-running inference. This is the reference implementation of the multi-library data model; see "Multi-library data model" above.

**File-watch hook** (`src/main.rs::process_new_files`): for each photo with a populated `content_hash`, check `FaceDao::already_scanned(hash)`; if not, send bytes (or embedded JPEG preview for RAW via `exif::extract_embedded_jpeg_preview`) to Apollo's `/api/internal/faces/detect`. K=`FACE_DETECT_CONCURRENCY` (default 8) parallel calls per scan tick; Apollo serializes them via its single-worker GPU pool. `face_watch.rs` is the Tokio orchestration layer.

**Per-tick backlog drain** (also `src/main.rs`): two passes that run on every watcher tick regardless of quick-vs-full scan:
- `backfill_unhashed_backlog`: populates `image_exif.content_hash` for photos that were ingested without a hash. Capped by `FACE_HASH_BACKFILL_MAX_PER_TICK` (default 2000); errors don't burn the cap.
- `process_face_backlog`: runs detection on photos that have a hash but no `face_detections` row. Capped by `FACE_BACKLOG_MAX_PER_TICK` (default 64). Selected via a SQL anti-join (`FaceDao::list_unscanned_candidates`); videos and EXCLUDED_DIRS paths are filtered out client-side via `face_watch::filter_excluded` so they never reach Apollo.

**Auto-bind on detection:** when a photo carries a tag whose name matches a `persons.name` (case-insensitive), the new face binds automatically iff cosine similarity to the person's existing-face mean is ≥ `FACE_AUTOBIND_MIN_COS` (default 0.4). Persons with no existing faces bind unconditionally and the new face becomes the cover.
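
The similarity check behind auto-bind can be sketched with plain vectors. The helper names (`cosine`, `mean_embedding`, `should_autobind`) are hypothetical, embeddings are modelled as `Vec<f32>` rather than BLOBs, and the tag-name match is assumed to have already succeeded.

```rust
/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Element-wise mean of equally sized embeddings; `None` for an empty set.
fn mean_embedding(faces: &[Vec<f32>]) -> Option<Vec<f32>> {
    let first = faces.first()?;
    let mut mean = vec![0.0f32; first.len()];
    for face in faces {
        for (m, x) in mean.iter_mut().zip(face) {
            *m += x;
        }
    }
    let n = faces.len() as f32;
    Some(mean.into_iter().map(|m| m / n).collect())
}

/// Bind unconditionally when the person has no faces yet; otherwise require
/// cosine similarity to the existing-face mean to reach `min_cos`.
fn should_autobind(new_face: &[f32], existing: &[Vec<f32>], min_cos: f32) -> bool {
    match mean_embedding(existing) {
        None => true,
        Some(mean) => cosine(new_face, &mean) >= min_cos,
    }
}
```

With the default threshold of 0.4, a face orthogonal to the person's mean (similarity 0) is left unbound for manual review.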

**Manual face create** (`POST /image/faces`): crops the image to the user-supplied bbox, applies EXIF orientation via `exif::apply_orientation` (the `image` crate hands raw pre-rotation pixels; without this, manually-drawn bboxes never resolved a face on re-detection), pads to ~50% of bbox dims (RetinaFace anchor scales need ~50% face-fill at det_size=640), then calls Apollo's embed endpoint. A `force` flag lets the operator save a face the detector couldn't see (e.g. profile shots, occluded faces); the row gets a zero-vector embedding so it's manually-bound only and won't participate in clustering.

**Rerun preserves manual rows** (`POST /image/faces/{id}/rerun`): only `source='auto'` rows are deleted before re-running detection. `already_scanned` returns true on ANY row, so a photo whose only faces are manually drawn never auto-redetects.

**Stats domain: content_hash, not file rows** (`FaceDao::stats` in `src/faces.rs`): `total_photos` counts `DISTINCT content_hash` over `image_exif` (filtered to image extensions, `content_hash IS NOT NULL`), and so do `scanned` / `with_faces` / `no_faces` / `failed` over `face_detections`. Numerator and denominator must live in the same domain: `face_detections` is keyed on content_hash, so the same JPEG present at two rel_paths or in two libraries scans once. Counting `image_exif` rows in the denominator inflated total by one per duplicate file and produced a permanent gap (e.g. 1101/1103 with nothing actually pending). Hash-less rows are excluded from total_photos while they sit in the `backfill_unhashed_backlog` queue; otherwise the bar pins below 100% for the duration of that backfill even though those rows aren't pending detection yet; they're pending hashing.

Module map:
- `src/faces.rs`: `FaceDao` trait + `SqliteFaceDao` impl, route handlers for `/faces/*`, `/image/faces/*`, `/persons/*`. Mirror of `tags.rs` layout.
- `src/face_watch.rs`: Tokio orchestration for the file-watch detect pass; `filter_excluded` (PathExcluder + image-extension filter), `read_image_bytes_for_detect` (RAW preview fallback).
- `src/ai/face_client.rs`: HTTP client for Apollo's inference. Configured by `APOLLO_FACE_API_BASE_URL`, falls back to `APOLLO_API_BASE_URL`. Both unset → feature disabled, file-watch hook is a no-op.
- `migrations/2026-04-29-000000_add_faces/`: schema.

### Startup Sequence

1. Load `.env` file
2. Run embedded Diesel migrations
3. Spawn file watcher thread
4. Create initial thumbnails (parallel scan)
5. Generate video GIF thumbnails
6. Initialize AppState with Actix actors
7. Set up Prometheus metrics (`imageserver_image_total`, `imageserver_video_total`)
8. Scan directory for videos and queue HLS processing
9. Start HTTP server on `BIND_URL` + localhost:8088

## Testing Patterns

Tests require `BASE_PATH` environment variable. Many integration tests create temporary directories and files.

When testing database code:
- Use in-memory SQLite: `DATABASE_URL=":memory:"`
- Run migrations in test setup
- Clean up with `DROP TABLE` or use `#[serial]` from `serial_test` crate if parallel tests conflict

## Common Gotchas

**EXIF Date Parsing:**
Multiple formats supported (EXIF DateTime, ISO8601, Unix timestamp). Fallback chain attempts multiple parsers.

**Video Processing:**
ffmpeg processes run asynchronously via actors. Use `StreamActor` to track completion. HLS segments written to `VIDEO_PATH`.

**File Extensions:**
Extension detection is case-insensitive. Use `file_types.rs` helpers rather than manual string matching.

**Migration Workflow:**
After creating a migration, manually edit the SQL, then regenerate `schema.rs` with `diesel print-schema`. Migrations auto-run on startup via the `embed_migrations!()` macro.

**Path Absolutization:**
Use `path-absolutize` crate's `.absolutize()` method when converting user-provided paths to ensure they're within `BASE_PATH`.

## Required Environment Variables
|
||
|
||
```bash
|
||
DATABASE_URL=./database.db # SQLite database path
|
||
BASE_PATH=/path/to/media # Root media directory
|
||
THUMBNAILS=/path/to/thumbnails # Thumbnail storage
|
||
VIDEO_PATH=/path/to/video/hls # HLS playlist output
|
||
GIFS_DIRECTORY=/path/to/gifs # Video GIF thumbnails
|
||
BIND_URL=0.0.0.0:8080 # Server binding
|
||
CORS_ALLOWED_ORIGINS=http://localhost:3000
|
||
SECRET_KEY=your-secret-key-here # JWT signing secret
|
||
RUST_LOG=info # Log level
|
||
EXCLUDED_DIRS=/private,/archive # Comma-separated paths to exclude from memories
|
||
```
|
||
|
Optional:
```bash
WATCH_QUICK_INTERVAL_SECONDS=60 # Quick scan interval
WATCH_FULL_INTERVAL_SECONDS=3600 # Full scan interval
OTLP_OTLS_ENDPOINT=http://... # OpenTelemetry collector (release builds)

# AI Insights Configuration
OLLAMA_PRIMARY_URL=http://desktop:11434 # Primary Ollama server (e.g., desktop)
OLLAMA_FALLBACK_URL=http://server:11434 # Fallback Ollama server (optional, always-on)
OLLAMA_PRIMARY_MODEL=nemotron-3-nano:30b # Model for primary server (default: nemotron-3-nano:30b)
OLLAMA_FALLBACK_MODEL=llama3.2:3b # Model for fallback server (optional, uses primary if not set)
OLLAMA_REQUEST_TIMEOUT_SECONDS=120 # Per-request generation timeout (default 120). Increase for slow CPU-offloaded models.
SMS_API_URL=http://localhost:8000 # SMS message API endpoint (default: localhost:8000)
SMS_API_TOKEN=your-api-token # SMS API authentication token (optional)

# Apollo Places integration (optional). When set, photo-insight enrichment
# folds the user's personal place name (Home, Work, Cabin, ...) into the
# location string fed to the LLM, and the agentic loop gains a
# `get_personal_place_at` tool. Unset = legacy Nominatim-only path.
APOLLO_API_BASE_URL=http://apollo.lan:8000 # Base URL of the sibling Apollo backend

# Face inference (optional). Apollo also hosts the insightface inference
# service; ImageApi calls it from the file-watch hook (Phase 3) and from
# the manual face-create endpoint. Falls back to APOLLO_API_BASE_URL when
# unset (typical single-Apollo deploy). Both unset = feature disabled.
APOLLO_FACE_API_BASE_URL=http://apollo.lan:8000 # Override if face service runs separately
FACE_AUTOBIND_MIN_COS=0.4 # Phase 3: cosine-sim floor for tag-name auto-bind
FACE_DETECT_CONCURRENCY=8 # Phase 3: per-scan-tick parallel detect calls
FACE_DETECT_TIMEOUT_SEC=60 # reqwest client timeout (CPU inference can be slow)

# OpenRouter (Hybrid Backend) - keeps embeddings + vision local, routes chat to OpenRouter
OPENROUTER_API_KEY=sk-or-... # Required to enable hybrid backend
OPENROUTER_DEFAULT_MODEL=anthropic/claude-sonnet-4 # Used when client doesn't pick a model
OPENROUTER_ALLOWED_MODELS=openai/gpt-4o-mini,anthropic/claude-haiku-4-5,google/gemini-2.5-flash
# Curated allowlist exposed to clients via
# GET /insights/openrouter/models. Empty = no picker.
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1 # Override base URL (optional)
OPENROUTER_EMBEDDING_MODEL=openai/text-embedding-3-small # Optional, embeddings stay local today
OPENROUTER_HTTP_REFERER=https://your-site.example # Optional attribution header
OPENROUTER_APP_TITLE=ImageApi # Optional attribution header

# Insight Chat Continuation
AGENTIC_CHAT_MAX_ITERATIONS=6 # Cap on tool-calling iterations per chat turn (default 6)
```

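
Two of the variables above, `EXCLUDED_DIRS` and `OPENROUTER_ALLOWED_MODELS`, hold comma-separated lists. The following is a minimal sketch of parsing such a value, assuming surrounding whitespace and empty entries are dropped. `parse_comma_list` and `list_from_env` are hypothetical names, not the project's own parsers.

```rust
/// Split a comma-separated value, trimming whitespace and dropping empty entries.
pub fn parse_comma_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Read a list-valued variable; an unset or non-UTF-8 variable yields an empty list.
pub fn list_from_env(name: &str) -> Vec<String> {
    std::env::var(name)
        .map(|value| parse_comma_list(&value))
        .unwrap_or_default()
}
```

Treating an unset variable as an empty list keeps callers free of `Option` handling: `list_from_env("EXCLUDED_DIRS")` always returns a `Vec<String>`.
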
**AI Insights Fallback Behavior:**
- Primary server is tried first with its configured model (5-second connection timeout)
- On connection failure, automatically falls back to the secondary server with its model (if configured)
- If `OLLAMA_FALLBACK_MODEL` is not set, the fallback server uses the same model as the primary
- Per-request generation timeout defaults to 120 seconds (`OLLAMA_REQUEST_TIMEOUT_SECONDS`) to accommodate slow LLM inference
- Logs indicate which server and model were used (info level) and failover attempts (warn level)
- Backwards compatible: `OLLAMA_URL` and `OLLAMA_MODEL` are still supported as fallbacks

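
The fallback rules above can be sketched without any HTTP code by passing the actual request in as a closure. Everything here is illustrative: `Endpoint`, `CallError`, `OllamaConfig`, and `generate_with_fallback` are hypothetical names, and timeouts are left to the caller-supplied closure.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Endpoint {
    pub url: String,
    pub model: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    /// The server could not be reached; only this variant triggers fallback.
    Connect(String),
    /// Any other failure (bad model, generation error, ...).
    Other(String),
}

pub struct OllamaConfig {
    pub primary_url: String,
    pub primary_model: String,
    pub fallback_url: Option<String>,
    pub fallback_model: Option<String>,
}

impl OllamaConfig {
    fn primary(&self) -> Endpoint {
        Endpoint { url: self.primary_url.clone(), model: self.primary_model.clone() }
    }

    /// The fallback endpoint reuses the primary model when none is configured.
    fn fallback(&self) -> Option<Endpoint> {
        let url = self.fallback_url.clone()?;
        let model = self
            .fallback_model
            .clone()
            .unwrap_or_else(|| self.primary_model.clone());
        Some(Endpoint { url, model })
    }
}

/// Try the primary endpoint, then the fallback on a connection failure.
/// Returns the endpoint that answered together with its response text.
pub fn generate_with_fallback<F>(
    cfg: &OllamaConfig,
    mut call: F,
) -> Result<(Endpoint, String), CallError>
where
    F: FnMut(&Endpoint) -> Result<String, CallError>,
{
    let primary = cfg.primary();
    match call(&primary) {
        Ok(text) => Ok((primary, text)),
        Err(CallError::Connect(reason)) => match cfg.fallback() {
            Some(fallback) => {
                eprintln!("WARN primary {} unreachable ({reason}); trying {}", primary.url, fallback.url);
                let text = call(&fallback)?;
                Ok((fallback, text))
            }
            None => Err(CallError::Connect(reason)),
        },
        Err(other) => Err(other),
    }
}
```

Returning the `Endpoint` alongside the text is what lets the caller log which server and model actually produced the insight.
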
**Model Discovery:**
The `OllamaClient` provides methods to query available models:
- `OllamaClient::list_models(url)` - Returns list of all models on a server
- `OllamaClient::is_model_available(url, model_name)` - Checks if a specific model exists

This allows runtime verification of model availability before generating insights.

**Hybrid Backend (OpenRouter):**
- Per-request opt-in via `backend=hybrid` on `POST /insights/generate/agentic`.
- Local Ollama still describes the image (vision); the description is inlined
  into the chat prompt and the agentic loop runs on OpenRouter.
- `request.model` (if provided) overrides `OPENROUTER_DEFAULT_MODEL` for that
  call. The mobile picker reads from `OPENROUTER_ALLOWED_MODELS`.
- No live capability precheck — the operator-curated allowlist is trusted.
  A bad model id surfaces as a chat-call error.
- `GET /insights/openrouter/models` returns `{ models, default_model, configured }`
  for client picker UIs.

**Insight Chat Continuation:**

After an agentic insight is generated, the full `Vec<ChatMessage>` transcript is
stored in `photo_insights.training_messages` and can be continued via the
chat endpoints. The `PhotoInsightResponse.has_training_messages` flag tells
clients whether chat is available for a given insight.

- `POST /insights/chat` runs one turn of the agentic loop against the replayed
  history. Body: `{ file_path, library?, user_message, model?, backend?, num_ctx?,
  temperature?, top_p?, top_k?, min_p?, max_iterations?, amend? }`.
- `POST /insights/chat/stream` is the SSE variant — same request body, response
  is `text/event-stream` with events: `iteration_start`, `text` (delta), `tool_call`,
  `tool_result`, `truncated`, `done`, plus a server-emitted `error_message` on
  failure. Preferred by the mobile client for live tool-chip updates.
- `GET /insights/chat/history?path=...&library=...` returns the rendered
  transcript. Each assistant message carries a `tools: [{name, arguments, result,
  result_truncated?}]` array with the tool invocations that led up to it. Tool
  results over 2000 chars are truncated with `result_truncated: true`.
- `POST /insights/chat/rewind` truncates the transcript at a given rendered
  index (drops that message + any tool-call scaffolding that preceded it + all
  later turns). Index 0 is protected. Used for "try again from here" flows.

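
The 2000-character cap on tool results in the history endpoint could be applied roughly as follows. This is a sketch under the assumption that the limit counts Unicode scalar values rather than bytes; `MAX_TOOL_RESULT_CHARS` and `truncate_tool_result` are hypothetical names.

```rust
/// Assumed limit, taken from the 2000-char figure in the endpoint description.
pub const MAX_TOOL_RESULT_CHARS: usize = 2000;

/// Returns the (possibly shortened) result plus a `result_truncated` flag.
/// Cutting at a `char_indices` boundary never splits a multi-byte character.
pub fn truncate_tool_result(result: &str) -> (String, bool) {
    match result.char_indices().nth(MAX_TOOL_RESULT_CHARS) {
        // A character exists at index 2000, so the input is over the limit.
        Some((byte_idx, _)) => (result[..byte_idx].to_string(), true),
        None => (result.to_string(), false),
    }
}
```

Slicing by byte length instead (`&result[..2000]`) would panic whenever byte 2000 falls inside a multi-byte character, which is why the sketch looks up the boundary first.
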
Backend routing rules (matches agentic-insight generation):
- Stored `backend` on the insight row is authoritative by default.
- `request.backend` may override per-turn. `local -> hybrid` is rejected in
  v1 (would require on-the-fly visual-description rewrite); `hybrid -> local`
  replays verbatim since the description is already inlined as text.
- `request.model` overrides the chat model (an Ollama id in local mode, an
  OpenRouter id in hybrid mode).

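
The routing rules above reduce to a small decision function. A minimal sketch, with `Backend` and `resolve_backend` as hypothetical names and a plain `String` standing in for whatever error type the handlers use:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Local,
    Hybrid,
}

/// Pick the backend for one chat turn.
/// `stored` is the value on the insight row; `requested` is the optional per-turn override.
pub fn resolve_backend(stored: Backend, requested: Option<Backend>) -> Result<Backend, String> {
    match (stored, requested) {
        // No override: the stored backend is authoritative.
        (_, None) => Ok(stored),
        // Rejected in v1: the visual description would have to be rewritten on the fly.
        (Backend::Local, Some(Backend::Hybrid)) => {
            Err("local -> hybrid override is not supported in v1".to_string())
        }
        // Same backend, or hybrid -> local (description is already inlined as text).
        (_, Some(backend)) => Ok(backend),
    }
}
```
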
Persistence:
- Append mode (default): re-serialize the full history and `UPDATE` the same
  row's `training_messages`.
- Amend mode (`amend: true`): regenerate the title, insert a new insight row
  via `store_insight` (auto-flips prior rows' `is_current=false`). Response
  surfaces the new row's id as `amended_insight_id`.

A per-`(library_id, file_path)` async mutex (`AppState.insight_chat.chat_locks`)
serialises concurrent turns on the same insight so writes to the JSON blob don't race.

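
The per-key locking idea can be sketched with a map of shared mutexes. Note the substitution: the code below uses the blocking `std::sync::Mutex` purely so the example is self-contained, whereas the text describes an async mutex. `ChatLocks`, `ChatKey`, and `lock_for` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Key identifying one insight: (library_id, file_path). Field types are assumed.
type ChatKey = (i32, String);

/// Registry handing out one shared lock per key.
#[derive(Default)]
pub struct ChatLocks {
    inner: Mutex<HashMap<ChatKey, Arc<Mutex<()>>>>,
}

impl ChatLocks {
    /// Fetch (or lazily create) the lock for this insight.
    /// The registry lock is held only for the map lookup, not for the chat turn.
    pub fn lock_for(&self, library_id: i32, file_path: &str) -> Arc<Mutex<()>> {
        let mut map = self.inner.lock().unwrap();
        let lock = map
            .entry((library_id, file_path.to_string()))
            .or_default()
            .clone();
        lock
    }
}
```

A handler would call `lock_for(...)`, hold the returned lock's guard for the duration of the turn, and release it after the row is written. Turns on different insights use different locks and do not block each other.
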
Context management is a soft bound: if the serialized history exceeds
`num_ctx - 2048` tokens (estimated with a cheap 4-bytes-per-token heuristic), the oldest
assistant-tool_call + tool_result pairs are dropped until under budget. The
initial user message (with any images) and system prompt are always preserved.
The `truncated` event / flag is surfaced to the client when a drop occurs.

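
The trimming rule can be sketched as a loop that removes the oldest tool-call/tool-result pair until the estimate fits. This is an illustration under simplifying assumptions: `Msg`, `Role`, and `trim_to_budget` are hypothetical, and message content length stands in for the size of the serialized history.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    AssistantToolCall,
    ToolResult,
    Assistant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Msg {
    pub role: Role,
    pub content: String,
}

/// Cheap estimate: roughly 4 bytes per token.
fn estimate_tokens(history: &[Msg]) -> usize {
    history.iter().map(|m| m.content.len()).sum::<usize>() / 4
}

/// Drop the oldest tool_call + tool_result pairs until the history fits in
/// `num_ctx - 2048` tokens. Returns true if anything was dropped.
/// System and user messages are never candidates, so they always survive.
pub fn trim_to_budget(history: &mut Vec<Msg>, num_ctx: usize) -> bool {
    let budget = num_ctx.saturating_sub(2048);
    let mut truncated = false;
    while estimate_tokens(history) > budget {
        // Find the oldest tool call immediately followed by its result.
        let oldest_pair = history.windows(2).position(|w| {
            w[0].role == Role::AssistantToolCall && w[1].role == Role::ToolResult
        });
        match oldest_pair {
            Some(i) => {
                history.drain(i..i + 2);
                truncated = true;
            }
            // Nothing left to drop: the bound is soft, so give up here.
            None => break,
        }
    }
    truncated
}
```

The boolean return maps onto the `truncated` event / flag described above.
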
Configurable env:
- `AGENTIC_CHAT_MAX_ITERATIONS` — cap on tool-calling iterations per turn
  (default 6). Per-request `max_iterations` is clamped to this cap.

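
A small sketch of the clamping rule. How the server treats an unparsable or zero value is not stated above, so falling back to the default in those cases is an assumption, as are the names `iteration_cap` and `effective_iterations`.

```rust
/// Default cap when `AGENTIC_CHAT_MAX_ITERATIONS` is unset.
pub const DEFAULT_MAX_ITERATIONS: u32 = 6;

/// Resolve the server-wide cap from the raw env value.
/// Assumption: unparsable or zero values fall back to the default.
pub fn iteration_cap(env_value: Option<&str>) -> u32 {
    env_value
        .and_then(|v| v.trim().parse::<u32>().ok())
        .filter(|n| *n > 0)
        .unwrap_or(DEFAULT_MAX_ITERATIONS)
}

/// Clamp a per-request `max_iterations` to the cap; no request value means "use the cap".
pub fn effective_iterations(requested: Option<u32>, cap: u32) -> u32 {
    requested.map_or(cap, |n| n.min(cap))
}
```

For example, with the default cap a request asking for 20 iterations runs at most 6, while a request asking for 3 runs at most 3.
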
**Apollo Places integration (optional):**

The sibling Apollo project (personal location-history viewer) owns
user-defined Places: `name + lat/lon + radius_m + description (+ optional
category)`. When `APOLLO_API_BASE_URL` is set, ImageApi queries
`/api/places/contains?lat=&lon=` to enrich the LLM prompt's location
string. See `src/ai/apollo_client.rs` and `src/ai/insight_generator.rs`:

- **Auto-enrichment** (always on when configured): the per-photo location
  resolver folds the most-specific containing Place ("Home — near
  Cambridge, MA" or "Home (My house in Cambridge) — near Cambridge, MA"
  when a description is set) into the location field of `combine_contexts`.
  Smallest-radius wins — Apollo sorts server-side, this code takes `[0]`.
- **Agentic tool** `get_personal_place_at(latitude, longitude)`: registered
  alongside `reverse_geocode` only when `apollo_enabled()` returns true.
  Returns "- Name [category]: description (radius N m)" lines, smallest
  radius first. The tool is **deliberately narrow** — no enumerate-all
  variant; auto-enrichment covers the photo-context path and the agentic
  tool covers ad-hoc lat/lon questions in chat continuation.

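
The auto-enrichment string format can be illustrated with a short sketch. `Place` and `enrich_location` are hypothetical stand-ins for whatever `apollo_client.rs` and `insight_generator.rs` define; the only behaviour taken from the text is "use element `[0]`" and the two example output shapes.

```rust
/// Minimal stand-in for a Place returned by Apollo (fields assumed).
pub struct Place {
    pub name: String,
    pub description: Option<String>,
    pub radius_m: f64,
}

/// Fold the most specific containing Place into the location string.
/// `places` is assumed to arrive sorted smallest radius first, so `[0]` wins.
/// With no containing Place, the reverse-geocoded string is returned unchanged.
pub fn enrich_location(places: &[Place], nominatim: &str) -> String {
    match places.first() {
        Some(place) => match place.description.as_deref().filter(|d| !d.is_empty()) {
            Some(desc) => format!("{} ({}) — near {}", place.name, desc, nominatim),
            None => format!("{} — near {}", place.name, nominatim),
        },
        None => nominatim.to_string(),
    }
}
```

Because an empty slice returns the input unchanged, an Apollo failure that yields empty results degrades to the plain Nominatim string without extra handling.
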
Failure modes degrade silently to the legacy Nominatim path: 5 s timeout,
errors logged at `warn`, empty results returned. Apollo's routes are
unauthenticated (single-user, LAN-trust); add JWT auth here and on Apollo's
side if exposing beyond a trusted network.

## Dependencies of Note

### Rust crates

- **actix-web**: HTTP framework
- **diesel**: ORM for SQLite
- **jsonwebtoken**: JWT implementation
- **kamadak-exif**: EXIF parsing
- **image**: Thumbnail generation
- **walkdir**: Directory traversal
- **rayon**: Parallel processing
- **opentelemetry**: Distributed tracing
- **bcrypt**: Password hashing
- **infer**: Magic number file type detection

### External binaries (must be on `PATH`)

- **`ffmpeg`** — video thumbnail extraction (`StreamActor`, HLS pipeline) and
  the HEIF/HEIC/NEF/ARW thumbnail fallback in `generate_image_thumbnail_ffmpeg`.
  Required for any deploy that holds video or HEIF files.
- **`exiftool`** — optional but strongly recommended for RAW-heavy libraries.
  The thumbnail pipeline shells out to it as the slow-path fallback for
  embedded preview extraction (Nikon MakerNote `PreviewIFD`, Canon SubIFDs,
  etc. — anything kamadak-exif's IFD0/IFD1 readers can't reach). Without
  exiftool installed, RAWs whose preview lives outside IFD0/IFD1 will fall
  through to ffmpeg, which often produces black thumbnails. Install via
  package manager: `apt install libimage-exiftool-perl`,
  `brew install exiftool`, `winget install OliverBetz.ExifTool`, or
  `choco install exiftool`.
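
A deploy script or startup check could probe for these binaries before relying on them. The sketch below is an assumption about how such a probe might look, not something the project is known to do; `binary_available` and `missing_binaries` are hypothetical names. It treats a binary as present when the process can be spawned at all, regardless of exit code.

```rust
use std::process::{Command, Stdio};

/// (program, flag that prints its version) pairs to probe.
const REQUIRED_BINARIES: &[(&str, &str)] = &[("ffmpeg", "-version"), ("exiftool", "-ver")];

/// True when `program` can be spawned from `PATH`. Output is discarded.
pub fn binary_available(program: &str, version_flag: &str) -> bool {
    Command::new(program)
        .arg(version_flag)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .is_ok()
}

/// Names of the probed binaries that could not be spawned.
pub fn missing_binaries() -> Vec<&'static str> {
    REQUIRED_BINARIES
        .iter()
        .filter(|(program, flag)| !binary_available(program, flag))
        .map(|(program, _)| *program)
        .collect()
}
```

Logging the result of `missing_binaries()` at startup would surface a missing `exiftool` early, before RAW thumbnails start coming out black.
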