kamadak-exif's In::PRIMARY / In::THUMBNAIL only address IFD0 and IFD1. On modern Nikon NEFs the full-res review JPEG lives in the MakerNote's PreviewIFD (and many Canon CR2s / DNGs put theirs in a SubIFD chain) — both unreachable through the existing reader, so the previous patch still produced no preview for those files and the pipeline fell through to ffmpeg, which writes black frames when it can't decode the RAW. Add a slow-path layer in extract_embedded_jpeg_preview that shells out to exiftool for PreviewImage / JpgFromRaw / OtherImage (one process per tag). All candidates from both layers are pooled and the largest valid JPEG wins. exiftool not on PATH degrades to fast-path-only behavior rather than breaking — the fallback is a strict superset. Documented the new optional dependency in README.md and CLAUDE.md with install commands for apt / brew / winget / choco. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

An Actix-web REST API for serving images and videos from a filesystem with automatic thumbnail generation, EXIF extraction, tag organization, and a memories feature for browsing photos by date. Uses SQLite/Diesel ORM for data persistence and ffmpeg for video processing.

## Development Commands

### Building & Running
```bash
# Build for development
cargo build

# Build for release (uses thin LTO optimization)
cargo build --release

# Run the server (requires .env file with DATABASE_URL, BASE_PATH, THUMBNAILS, VIDEO_PATH, BIND_URL, SECRET_KEY)
cargo run

# Run with specific log level
RUST_LOG=debug cargo run
```

### Testing
```bash
# Run all tests (requires BASE_PATH in .env)
cargo test

# Run specific test
cargo test test_name

# Run tests with output
cargo test -- --nocapture
```

### Database Migrations
```bash
# Install diesel CLI (one-time setup)
cargo install diesel_cli --no-default-features --features sqlite

# Create new migration
diesel migration generate migration_name

# Run migrations (also runs automatically on app startup)
diesel migration run

# Revert last migration
diesel migration revert

# Regenerate schema.rs after manual migration changes
diesel print-schema > src/database/schema.rs
```

### Code Quality
```bash
# Format code
cargo fmt

# Run clippy linter
cargo clippy

# Fix automatically fixable issues
cargo fix
```

### Utility Binaries
```bash
# Two-phase cleanup: resolve missing files and validate file types
cargo run --bin cleanup_files -- --base-path /path/to/media --database-url ./database.db
```

## Architecture Overview

### Core Components

**Layered Architecture:**
- **HTTP Layer** (`main.rs`): Route handlers for images, videos, metadata, tags, favorites, memories
- **Auth Layer** (`auth.rs`): JWT token validation, Claims extraction via FromRequest trait
- **Service Layer** (`files.rs`, `exif.rs`, `memories.rs`): Business logic for file operations and EXIF extraction
- **DAO Layer** (`database/mod.rs`): Trait-based data access (ExifDao, UserDao, FavoriteDao, TagDao)
- **Database Layer**: Diesel ORM with SQLite, schema in `database/schema.rs`

**Async Actor System (Actix):**
- `StreamActor`: Manages ffmpeg video processing lifecycle
- `VideoPlaylistManager`: Scans directories and queues videos
- `PlaylistGenerator`: Creates HLS playlists for video streaming

### Database Schema & Patterns

**Tables:**
- `users`: Authentication (id, username, password_hash)
- `favorites`: User-specific favorites (userid, path)
- `tags`: Custom labels with timestamps
- `tagged_photo`: Many-to-many photo-tag relationships
- `image_exif`: Rich metadata (file_path + 16 EXIF fields: camera, GPS, dates, exposure settings)

**DAO Pattern:**
All database access goes through trait-based DAOs (e.g., `ExifDao`, `SqliteExifDao`). Connection pooling uses `Arc<Mutex<SqliteConnection>>`. All DB operations are traced with OpenTelemetry in release builds.

**Key DAO Methods:**
- `store_exif()`, `get_exif()`, `get_exif_batch()`: EXIF CRUD operations
- `query_by_exif()`: Complex filtering by camera, GPS bounds, date ranges
- Batch operations minimize DB hits during file watching

### File Processing Pipeline

**Thumbnail Generation:**
1. Startup scan: Rayon parallel walk of BASE_PATH
2. Creates 200x200 thumbnails in THUMBNAILS directory (mirrors source structure)
3. Videos: extracts frame at 3-second mark via ffmpeg
4. Images: uses `image` crate for JPEG/PNG processing
5. RAW formats (NEF/CR2/ARW/DNG/etc.): the `image` crate can't decode RAW
   pixel data, so the pipeline pulls an embedded JPEG preview instead. The fast
   path is `exif::read_jpeg_at_ifd` against IFD0 (PRIMARY) and IFD1
   (THUMBNAIL), which covers most older bodies and DNGs. The slow-path fallback
   shells out to **`exiftool`** for `PreviewImage` / `JpgFromRaw` / `OtherImage`,
   which reaches MakerNote / SubIFD-hosted previews that kamadak-exif can't see
   (e.g. Nikon's `PreviewIFD`, where modern Nikon bodies store the full-res
   review JPEG). All candidates are pooled and the largest valid JPEG wins.
   If `exiftool` is not on `PATH`, only the fast path runs.
   See `src/exif.rs::extract_embedded_jpeg_preview`.

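
The "pool all candidates, largest valid JPEG wins" rule from step 5 can be sketched roughly as follows. This is a minimal illustration, not the code in `src/exif.rs`: the helper names `looks_like_jpeg` and `pick_largest_jpeg` are hypothetical, "valid" is assumed to mean "starts with the SOI marker and ends with the EOI marker", and "largest" is assumed to mean byte length.

```rust
/// Assumption: a candidate counts as a valid JPEG when it starts with the
/// SOI marker (FF D8) and ends with the EOI marker (FF D9).
fn looks_like_jpeg(bytes: &[u8]) -> bool {
    bytes.len() >= 4 && bytes.starts_with(&[0xFF, 0xD8]) && bytes.ends_with(&[0xFF, 0xD9])
}

/// Pools candidates from every extraction layer and keeps the one with the
/// most bytes among those that pass the marker check.
fn pick_largest_jpeg(candidates: Vec<Vec<u8>>) -> Option<Vec<u8>> {
    candidates
        .into_iter()
        .filter(|candidate| looks_like_jpeg(candidate))
        .max_by_key(|candidate| candidate.len())
}
```

A real implementation may need a more tolerant check, since some cameras pad embedded previews after the EOI marker.
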
**File Watching:**
Runs in background thread with two-tier strategy:
- **Quick scan** (default 60s): Recently modified files only
- **Full scan** (default 3600s): Comprehensive directory check
- Batch queries EXIF DB to detect new files
- Configurable via `WATCH_QUICK_INTERVAL_SECONDS` and `WATCH_FULL_INTERVAL_SECONDS`

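
A rough sketch of how the two-tier schedule could be decided on each tick. The names `ScanKind`, `interval_from`, and `next_scan` are hypothetical, and the actual watcher loop may be structured differently; only the defaults (60s and 3600s) and the env variable names come from this document.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ScanKind {
    Quick,
    Full,
}

/// Parses an interval in seconds, falling back to `default_secs` when the
/// variable is unset or not a valid unsigned integer.
fn interval_from(raw: Option<&str>, default_secs: u64) -> Duration {
    let secs = raw
        .and_then(|value| value.trim().parse::<u64>().ok())
        .unwrap_or(default_secs);
    Duration::from_secs(secs)
}

/// Chooses a full scan once the full interval has elapsed since the last
/// full scan; otherwise a quick scan.
fn next_scan(since_last_full: Duration, full_interval: Duration) -> ScanKind {
    if since_last_full >= full_interval {
        ScanKind::Full
    } else {
        ScanKind::Quick
    }
}
```

The raw value would typically come from something like `std::env::var("WATCH_FULL_INTERVAL_SECONDS").ok().as_deref()`, passed together with the default of 3600.
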
**EXIF Extraction:**
- Uses `kamadak-exif` crate
- Supports: JPEG, TIFF, RAW (NEF, CR2, CR3), HEIF/HEIC, PNG, WebP
- Extracts: camera make/model, lens, dimensions, GPS coordinates, focal length, aperture, shutter speed, ISO, date taken
- Triggered on upload and during file watching

**File Upload Behavior:**
If a file with the same name already exists, the upload appends a timestamp to the filename (`photo_1735124234.jpg`) so that earlier uploads are preserved rather than overwritten.

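
The renaming rule could look roughly like this. `timestamped_name` is a hypothetical helper, and the sketch assumes the suffix is a Unix timestamp in seconds inserted between the file stem and the extension, which is what the `photo_1735124234.jpg` example suggests.

```rust
use std::path::{Path, PathBuf};

/// Builds `<stem>_<unix_secs>.<ext>` next to the original path.
/// Falls back to the stem "upload" when the path has no usable file name.
fn timestamped_name(path: &Path, unix_secs: u64) -> PathBuf {
    let stem = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("upload");
    let name = match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{}_{}.{}", stem, unix_secs, ext),
        None => format!("{}_{}", stem, unix_secs),
    };
    path.with_file_name(name)
}
```
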
### Authentication Flow

**Login:**
1. POST `/login` with username/password
2. Verify with `bcrypt::verify()` against password_hash
3. Generate JWT with claims: `{ sub: user_id, exp: 5_days_from_now }`
4. Sign with HS256 using `SECRET_KEY` environment variable

**Authorization:**
All protected endpoints extract `Claims` via the `FromRequest` trait implementation. The token is passed in the `Authorization: Bearer <token>` header.

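
As a small illustration of the two pieces above that do not depend on a JWT library, the sketch below pulls the token out of an `Authorization` header value and computes a five-day expiry. Both function names are hypothetical; the real extraction lives in the `FromRequest` implementation in `auth.rs` and may differ.

```rust
/// Returns the token from a header value of the form `Bearer <token>`.
fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Expiry (`exp` claim) for a token issued at `now_secs`: five days later.
fn expiry_from(now_secs: u64) -> u64 {
    now_secs + 5 * 24 * 60 * 60
}
```
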
### API Structure

**Key Endpoint Patterns:**

```rust
// Image serving & upload
GET /image?path=...&size=...&format=...
POST /image (multipart file upload)

// Metadata & EXIF
GET /image/metadata?path=...

// Advanced search with filters
GET /photos?path=...&recursive=true&sort=DateTakenDesc&camera_make=Canon&gps_lat=...&gps_lon=...&gps_radius_km=10&date_from=...&date_to=...&tag_ids=1,2,3&media_type=Photo

// Video streaming (HLS)
POST /video/generate (creates .m3u8 playlist + .ts segments)
GET /video/stream?path=... (serves playlist)

// Tags
GET /image/tags/all
POST /image/tags (add tag to file)
DELETE /image/tags (remove tag from file)
POST /image/tags/batch (bulk tag updates)

// Memories (week-based grouping)
GET /memories?path=...&recursive=true

// AI Insights
POST /insights/generate (non-agentic single-shot)
POST /insights/generate/agentic (tool-calling loop; body: { file_path, backend?, model?, ... })
GET /insights?path=...&library=...
GET /insights/models (local Ollama models + capabilities)
GET /insights/openrouter/models (curated OpenRouter allowlist)
POST /insights/rate (thumbs up/down for training data)

// Insight Chat Continuation
POST /insights/chat (single-turn reply, non-streaming)
POST /insights/chat/stream (SSE: text / tool_call / tool_result / truncated / done)
GET /insights/chat/history?path=... (rendered transcript with tool invocations)
POST /insights/chat/rewind (truncate transcript at a rendered index)
```

**Request Types:**
- `FilesRequest`: Supports complex filtering (tags, EXIF fields, GPS radius, date ranges)
- `SortType`: Shuffle, NameAsc/Desc, TagCountAsc/Desc, DateTakenAsc/Desc

### Important Patterns

**Service Builder Pattern:**
Routes are registered via the composable `ServiceBuilder` trait in `service.rs`, which lets each feature add its own routes as a module.

**Path Validation:**
Always use `is_valid_full_path(&base_path, &requested_path, check_exists)` to prevent directory traversal attacks.

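
To show the kind of check that guards against traversal, here is a purely lexical sketch. It is not the body of `is_valid_full_path`: `resolve_within` is a hypothetical name, it ignores the `check_exists` flag, and it does not resolve symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `requested` onto `base`, resolving `.` and `..` lexically.
/// Returns `None` when the request is absolute or would climb above `base`.
fn resolve_within(base: &Path, requested: &Path) -> Option<PathBuf> {
    let mut resolved = base.to_path_buf();
    for component in requested.components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {}
            Component::ParentDir => {
                // Refuse to pop past the base directory.
                if resolved.as_path() == base || !resolved.pop() {
                    return None;
                }
            }
            // Absolute paths and Windows prefixes are rejected in this sketch.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```

Because the walk starts at `base` and never pops above it, any returned path is lexically inside `base`.
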
**File Type Detection:**
Centralized in `file_types.rs` with constants `IMAGE_EXTENSIONS` and `VIDEO_EXTENSIONS`. Provides both `Path` and `DirEntry` variants for performance.

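
A minimal sketch of case-insensitive extension matching against such a constant. The extension list below is a made-up subset and `is_image_path` is a hypothetical name; the real lists and helpers live in `file_types.rs`.

```rust
use std::path::Path;

// Hypothetical subset for illustration only.
const IMAGE_EXTENSIONS: &[&str] = &["jpg", "jpeg", "png", "nef", "cr2"];

/// True when the path's extension matches a known image extension,
/// ignoring ASCII case.
fn is_image_path(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| {
            IMAGE_EXTENSIONS
                .iter()
                .any(|known| known.eq_ignore_ascii_case(ext))
        })
        .unwrap_or(false)
}
```
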
**OpenTelemetry Tracing:**
All database operations and HTTP handlers are wrapped in spans. Release builds export to the OTLP endpoint set in `OTLP_OTLS_ENDPOINT`. Debug builds use a basic logger.

**Memory Exclusion:**
`PathExcluder` in `memories.rs` filters directories out of the memories API via the `EXCLUDED_DIRS` environment variable (comma-separated paths or substring patterns).

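
One way such a filter could work is sketched below. It is a simplification: the function names are hypothetical, and every entry is treated as a plain substring pattern, whereas `PathExcluder` may distinguish full paths from patterns.

```rust
/// Splits a comma-separated `EXCLUDED_DIRS` value into trimmed,
/// non-empty patterns.
fn parse_excluded(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|pattern| !pattern.is_empty())
        .map(String::from)
        .collect()
}

/// True when any pattern occurs as a substring of `path`.
fn is_excluded(path: &str, patterns: &[String]) -> bool {
    patterns.iter().any(|pattern| path.contains(pattern.as_str()))
}
```
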
### Startup Sequence

1. Load `.env` file
2. Run embedded Diesel migrations
3. Spawn file watcher thread
4. Create initial thumbnails (parallel scan)
5. Generate video GIF thumbnails
6. Initialize AppState with Actix actors
7. Set up Prometheus metrics (`imageserver_image_total`, `imageserver_video_total`)
8. Scan directory for videos and queue HLS processing
9. Start HTTP server on `BIND_URL` + localhost:8088

## Testing Patterns

Tests require the `BASE_PATH` environment variable. Many integration tests create temporary directories and files.

When testing database code:
- Use in-memory SQLite: `DATABASE_URL=":memory:"`
- Run migrations in test setup
- Clean up with `DROP TABLE`, or use `#[serial]` from the `serial_test` crate if parallel tests conflict

## Common Gotchas

**EXIF Date Parsing:**
Multiple formats are supported (EXIF DateTime, ISO8601, Unix timestamp). A fallback chain tries each parser in turn.

**Video Processing:**
ffmpeg processes run asynchronously via actors. Use `StreamActor` to track completion. HLS segments are written to `VIDEO_PATH`.

**File Extensions:**
Extension detection is case-insensitive. Use `file_types.rs` helpers rather than manual string matching.

**Migration Workflow:**
After creating a migration, manually edit the SQL, then regenerate `schema.rs` with `diesel print-schema`. Migrations auto-run on startup via `embedded_migrations!()` macro.

**Path Absolutization:**
Use the `path-absolutize` crate's `.absolutize()` method when converting user-provided paths, so they can be checked against `BASE_PATH`.

## Required Environment Variables

```bash
DATABASE_URL=./database.db                 # SQLite database path
BASE_PATH=/path/to/media                   # Root media directory
THUMBNAILS=/path/to/thumbnails             # Thumbnail storage
VIDEO_PATH=/path/to/video/hls              # HLS playlist output
GIFS_DIRECTORY=/path/to/gifs               # Video GIF thumbnails
BIND_URL=0.0.0.0:8080                      # Server binding
CORS_ALLOWED_ORIGINS=http://localhost:3000
SECRET_KEY=your-secret-key-here            # JWT signing secret
RUST_LOG=info                              # Log level
EXCLUDED_DIRS=/private,/archive            # Comma-separated paths to exclude from memories
```

Optional:
```bash
WATCH_QUICK_INTERVAL_SECONDS=60    # Quick scan interval
WATCH_FULL_INTERVAL_SECONDS=3600   # Full scan interval
OTLP_OTLS_ENDPOINT=http://...      # OpenTelemetry collector (release builds)

# AI Insights Configuration
OLLAMA_PRIMARY_URL=http://desktop:11434   # Primary Ollama server (e.g., desktop)
OLLAMA_FALLBACK_URL=http://server:11434   # Fallback Ollama server (optional, always-on)
OLLAMA_PRIMARY_MODEL=nemotron-3-nano:30b  # Model for primary server (default: nemotron-3-nano:30b)
OLLAMA_FALLBACK_MODEL=llama3.2:3b         # Model for fallback server (optional, uses primary if not set)
OLLAMA_REQUEST_TIMEOUT_SECONDS=120        # Per-request generation timeout (default 120). Increase for slow CPU-offloaded models.
SMS_API_URL=http://localhost:8000         # SMS message API endpoint (default: localhost:8000)
SMS_API_TOKEN=your-api-token              # SMS API authentication token (optional)

# OpenRouter (Hybrid Backend) - keeps embeddings + vision local, routes chat to OpenRouter
OPENROUTER_API_KEY=sk-or-...                              # Required to enable hybrid backend
OPENROUTER_DEFAULT_MODEL=anthropic/claude-sonnet-4        # Used when client doesn't pick a model
OPENROUTER_ALLOWED_MODELS=openai/gpt-4o-mini,anthropic/claude-haiku-4-5,google/gemini-2.5-flash
                                                          # Curated allowlist exposed to clients via
                                                          # GET /insights/openrouter/models. Empty = no picker.
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1          # Override base URL (optional)
OPENROUTER_EMBEDDING_MODEL=openai/text-embedding-3-small  # Optional, embeddings stay local today
OPENROUTER_HTTP_REFERER=https://your-site.example         # Optional attribution header
OPENROUTER_APP_TITLE=ImageApi                             # Optional attribution header

# Insight Chat Continuation
AGENTIC_CHAT_MAX_ITERATIONS=6   # Cap on tool-calling iterations per chat turn (default 6)
```

**AI Insights Fallback Behavior:**
- The primary server is tried first with its configured model (5-second connection timeout)
- On connection failure, the request automatically falls back to the secondary server with its model (if configured)
- If `OLLAMA_FALLBACK_MODEL` is not set, the fallback server uses the same model as the primary
- Total request timeout defaults to 120 seconds (`OLLAMA_REQUEST_TIMEOUT_SECONDS`) to accommodate slow LLM inference
- Logs indicate which server and model were used (info level) and any failover attempts (warn level)
- Backwards compatible: `OLLAMA_URL` and `OLLAMA_MODEL` are still supported as fallbacks

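
The failover rules above can be sketched with two small helpers. These names are hypothetical and the sketch is deliberately generic: it falls back on any error from the primary call, whereas the behaviour described above falls back specifically on connection failure.

```rust
/// Model used on the fallback server: its own model when configured,
/// otherwise the primary server's model.
fn resolve_fallback_model<'a>(primary_model: &'a str, configured: Option<&'a str>) -> &'a str {
    configured.unwrap_or(primary_model)
}

/// Runs `primary`; if it fails and a fallback is configured, runs that
/// instead. Without a fallback, the primary error is returned.
fn with_fallback<T, E>(
    primary: impl FnOnce() -> Result<T, E>,
    fallback: Option<impl FnOnce() -> Result<T, E>>,
) -> Result<T, E> {
    match primary() {
        Ok(value) => Ok(value),
        Err(primary_err) => match fallback {
            Some(run_fallback) => run_fallback(),
            None => Err(primary_err),
        },
    }
}
```

In a real client the closures would wrap the HTTP calls to the primary and fallback Ollama servers, with logging at the points where the branch is taken.
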
**Model Discovery:**
The `OllamaClient` provides methods to query available models:
- `OllamaClient::list_models(url)` - Returns list of all models on a server
- `OllamaClient::is_model_available(url, model_name)` - Checks if a specific model exists

This allows runtime verification of model availability before generating insights.

**Hybrid Backend (OpenRouter):**
- Per-request opt-in via `backend=hybrid` on `POST /insights/generate/agentic`.
- Local Ollama still describes the image (vision); the description is inlined
  into the chat prompt and the agentic loop runs on OpenRouter.
- `request.model` (if provided) overrides `OPENROUTER_DEFAULT_MODEL` for that
  call. The mobile picker reads from `OPENROUTER_ALLOWED_MODELS`.
- No live capability precheck: the operator-curated allowlist is trusted.
  A bad model id surfaces as a chat-call error.
- `GET /insights/openrouter/models` returns `{ models, default_model, configured }`
  for client picker UIs.

**Insight Chat Continuation:**

After an agentic insight is generated, the full `Vec<ChatMessage>` transcript is
stored in `photo_insights.training_messages` and can be continued via the
chat endpoints. The `PhotoInsightResponse.has_training_messages` flag tells
clients whether chat is available for a given insight.

- `POST /insights/chat` runs one turn of the agentic loop against the replayed
  history. Body: `{ file_path, library?, user_message, model?, backend?, num_ctx?,
  temperature?, top_p?, top_k?, min_p?, max_iterations?, amend? }`.
- `POST /insights/chat/stream` is the SSE variant: same request body, response
  is `text/event-stream` with events: `iteration_start`, `text` (delta), `tool_call`,
  `tool_result`, `truncated`, `done`, plus a server-emitted `error_message` on
  failure. Preferred by the mobile client for live tool-chip updates.
- `GET /insights/chat/history?path=...&library=...` returns the rendered
  transcript. Each assistant message carries a `tools: [{name, arguments, result,
  result_truncated?}]` array with the tool invocations that led up to it. Tool
  results over 2000 chars are truncated with `result_truncated: true`.
- `POST /insights/chat/rewind` truncates the transcript at a given rendered
  index (drops that message + any tool-call scaffolding that preceded it + all
  later turns). Index 0 is protected. Used for "try again from here" flows.

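
The 2000-character cap on tool results in the history response could be applied along these lines. `truncate_result` is a hypothetical helper; the sketch counts Unicode scalar values and cuts on a character boundary, which may or may not match how the server counts "chars".

```rust
/// Returns the (possibly shortened) result and whether it was truncated.
/// Cuts after `max_chars` characters, never in the middle of a character.
fn truncate_result(result: &str, max_chars: usize) -> (String, bool) {
    match result.char_indices().nth(max_chars) {
        // A character exists at position `max_chars`, so the text is too long.
        Some((byte_idx, _)) => (result[..byte_idx].to_string(), true),
        None => (result.to_string(), false),
    }
}
```
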
Backend routing rules (matches agentic-insight generation):
- Stored `backend` on the insight row is authoritative by default.
- `request.backend` may override per-turn. `local -> hybrid` is rejected in
  v1 (would require on-the-fly visual-description rewrite); `hybrid -> local`
  replays verbatim since the description is already inlined as text.
- `request.model` overrides the chat model (an Ollama id in local mode, an
  OpenRouter id in hybrid mode).

Persistence:
- Append mode (default): re-serialize the full history and `UPDATE` the same
  row's `training_messages`.
- Amend mode (`amend: true`): regenerate the title, insert a new insight row
  via `store_insight` (auto-flips prior rows' `is_current=false`). Response
  surfaces the new row's id as `amended_insight_id`.

A per-`(library_id, file_path)` async mutex (`AppState.insight_chat.chat_locks`)
serialises concurrent turns on the same insight so that writes to the JSON blob don't race.

Context management is a soft bound: if the serialized history exceeds
`num_ctx - 2048` tokens (cheap 4-byte/token heuristic), the oldest
assistant-tool_call + tool_result pairs are dropped until under budget. The
initial user message (with any images) and system prompt are always preserved.
The `truncated` event / flag is surfaced to the client when a drop occurred.

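
The trimming rule can be sketched as follows. The `Msg` enum and both functions are hypothetical stand-ins for the real `ChatMessage` handling, and the token estimate sums message text bytes rather than measuring the actual serialized JSON.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    System(String),
    User(String),
    AssistantToolCall(String),
    ToolResult(String),
    Assistant(String),
}

fn text(msg: &Msg) -> &str {
    match msg {
        Msg::System(s)
        | Msg::User(s)
        | Msg::AssistantToolCall(s)
        | Msg::ToolResult(s)
        | Msg::Assistant(s) => s.as_str(),
    }
}

/// Cheap heuristic: roughly four bytes per token.
fn estimate_tokens(history: &[Msg]) -> usize {
    history.iter().map(|m| text(m).len()).sum::<usize>() / 4
}

/// Drops the oldest tool_call + tool_result pairs until the estimate fits
/// within `num_ctx - 2048`. System, user, and plain assistant messages are
/// never removed. Returns true when anything was dropped.
fn trim_history(history: &mut Vec<Msg>, num_ctx: usize) -> bool {
    let budget = num_ctx.saturating_sub(2048);
    let mut truncated = false;
    while estimate_tokens(history.as_slice()) > budget {
        // Oldest tool call that is immediately followed by its result.
        let pair = history.windows(2).position(|w| {
            matches!(w[0], Msg::AssistantToolCall(_)) && matches!(w[1], Msg::ToolResult(_))
        });
        match pair {
            Some(i) => {
                history.drain(i..i + 2);
                truncated = true;
            }
            None => break, // nothing droppable left; the bound is soft
        }
    }
    truncated
}
```

The returned flag corresponds to what would be surfaced to the client as `truncated`.
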
Configurable env:
- `AGENTIC_CHAT_MAX_ITERATIONS`: cap on tool-calling iterations per turn
  (default 6). Per-request `max_iterations` is clamped to this cap.

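
The clamping rule might look like this. `effective_max_iterations` is a hypothetical name, and using the cap itself when the request omits `max_iterations` is an assumption rather than documented behaviour.

```rust
/// Resolves the iteration limit for one chat turn.
/// `env_value` is the raw `AGENTIC_CHAT_MAX_ITERATIONS` setting, if any.
fn effective_max_iterations(requested: Option<u32>, env_value: Option<&str>) -> u32 {
    // Fall back to the documented default of 6 when unset or malformed.
    let cap = env_value
        .and_then(|value| value.trim().parse::<u32>().ok())
        .unwrap_or(6);
    // Assumption: with no per-request value, the cap itself is used.
    requested.map_or(cap, |r| r.min(cap))
}
```
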
## Dependencies of Note

### Rust crates

- **actix-web**: HTTP framework
- **diesel**: ORM for SQLite
- **jsonwebtoken**: JWT implementation
- **kamadak-exif**: EXIF parsing
- **image**: Thumbnail generation
- **walkdir**: Directory traversal
- **rayon**: Parallel processing
- **opentelemetry**: Distributed tracing
- **bcrypt**: Password hashing
- **infer**: Magic number file type detection

### External binaries (must be on `PATH`)

- **`ffmpeg`**: video thumbnail extraction (`StreamActor`, HLS pipeline) and
  the HEIF/HEIC/NEF/ARW thumbnail fallback in `generate_image_thumbnail_ffmpeg`.
  Required for any deploy that holds video or HEIF files.
- **`exiftool`**: optional but strongly recommended for RAW-heavy libraries.
  The thumbnail pipeline shells out to it as the slow-path fallback for
  embedded preview extraction (Nikon MakerNote `PreviewIFD`, Canon SubIFDs,
  and anything else kamadak-exif's IFD0/IFD1 readers can't reach). Without
  exiftool installed, RAWs whose preview lives outside IFD0/IFD1 fall
  through to ffmpeg, which often produces black thumbnails. Install via
  package manager: `apt install libimage-exiftool-perl`,
  `brew install exiftool`, `winget install OliverBetz.ExifTool`, or
  `choco install exiftool`.

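
A rough sketch of the exiftool slow path, one process per tag, degrading to "no candidates" when the binary is missing. The function names are hypothetical and this is not the code in `src/exif.rs`; the `-b` (binary output) option and the `-PreviewImage` / `-JpgFromRaw` / `-OtherImage` tag selectors are standard exiftool usage.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Runs `<binary> -b -<tag> <path>` and returns stdout when it is non-empty.
/// Returns `None` when the process cannot be spawned (for example when the
/// binary is not on PATH) or when it writes nothing to stdout.
fn extract_tag(binary: &str, path: &Path, tag: &str) -> Option<Vec<u8>> {
    let output = Command::new(binary)
        .arg("-b")
        .arg(format!("-{}", tag))
        .arg(path)
        .stdin(Stdio::null())
        .stderr(Stdio::null())
        .output()
        .ok()?;
    if output.stdout.is_empty() {
        None
    } else {
        Some(output.stdout)
    }
}

/// Collects whatever previews exiftool can find, one process per tag.
/// An empty vector means the caller is left with fast-path candidates only.
fn exiftool_candidates(path: &Path) -> Vec<Vec<u8>> {
    ["PreviewImage", "JpgFromRaw", "OtherImage"]
        .iter()
        .filter_map(|tag| extract_tag("exiftool", path, tag))
        .collect()
}
```

The bytes returned here are unvalidated; they would still need to pass whatever JPEG validity check the pipeline applies before competing with the IFD0/IFD1 candidates.
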