Three recurring issues on every full scan:
1. Video playlist scans re-enqueued every file only to reject it as
AlreadyExists. Pre-filter in ScanDirectoryMessage and QueueVideosMessage
so we skip videos whose .m3u8 already exists, and demote the leaked
AlreadyExists log to debug.
2. image crate was built with only jpeg/png features, so webp/tiff/avif
files logged "format not supported" every scan. Enable those features.
3. RAW (ARW/NEF/CR2/...) and HEIC thumbnails weren't generated, so the
scan kept retrying them. Try the file's embedded JPEG preview via
kamadak-exif first (fast, pure-Rust, works on Sony ARW where ffmpeg's
TIFF decoder fails). Fall back to ffmpeg for HEIC/HEIF and RAWs with
no preview. Anything still undecodable gets a <thumb>.unsupported
sentinel so future scans skip it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.
- Ollama: parses NDJSON from /api/chat stream, accumulates content
deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
through an mpsc channel, persists training_messages at the end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Preparation for a second LLM backend (OpenRouter) and hybrid vision-local /
chat-remote mode. Shared wire types (ChatMessage, Tool, ToolCall, etc.) move
into a new src/ai/llm_client.rs and are re-exported from ollama.rs so
existing imports keep working. OllamaClient now implements LlmClient.
No behavior change; callers still hold the concrete OllamaClient. Caller
migration to Arc<dyn LlmClient> is deferred to the PR that wires hybrid
backend routing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds blake3 content hashing as the basis for derivative dedup
(thumbnails, HLS) across libraries. Computed inline by the watcher on
ingest and by a new `backfill_hashes` binary for historical rows.
Key changes:
- `content_hash` and `size_bytes` are now populated on new image_exif
rows; a new ExifDao surface (`get_rows_missing_hash`,
`backfill_content_hash`, `find_by_content_hash`) supports backfill and
future hash-keyed lookups.
- The watcher now registers every image/video in image_exif, not just
files with parseable EXIF. EXIF becomes optional enrichment; videos
and other non-EXIF files still get a hashed row. This also makes
DB-indexed sort/filter cover the full library.
- `/image` thumbnail serve dual-looks up hash-keyed path first, then
falls back to the legacy mirrored layout.
- Upload flow accepts `?library=` query param + hashes uploaded files.
- Store_exif logs the underlying Diesel error on insert failure so
constraint violations surface instead of hiding behind a generic
InsertError.
- New migration normalizes rel_path separators to forward slash across
all tables, deduplicating any rows that collide after normalization.
Fixes spurious UNIQUE violations from mixed backslash/forward-slash
paths on Windows ingest.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements Phase 1 & 2 of Google Takeout RAG integration:
- Database migrations for calendar_events, location_history, search_history
- DAO implementations with hybrid time + semantic search
- Parsers for .ics, JSON, and HTML Google Takeout formats
- Import utilities with batch insert optimization
Features:
- CalendarEventDao: Hybrid time-range + semantic search for events
- LocationHistoryDao: GPS proximity with Haversine distance calculation
- SearchHistoryDao: Semantic-first search (queries are embedding-rich)
- Batch inserts for performance (1M+ records in minutes vs hours)
- OpenTelemetry tracing for all database operations
Import utilities:
- import_calendar: Parse .ics with optional embedding generation
- import_location_history: High-volume GPS data with batch inserts
- import_search_history: Always generates embeddings for semantic search
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit addresses several security vulnerabilities in the authentication
and authorization system:
1. JWT Encoding Panic Fix (Critical)
- Replace .unwrap() with proper error handling in JWT token generation
- Prevents server crashes from encoding failures
- Returns HTTP 500 with error logging instead of panicking
2. Rate Limiting for Login Endpoint (Critical)
- Add actix-governor dependency (v0.5)
- Configure rate limiter: 2 requests/sec with burst of 5
- Protects against brute-force authentication attacks
3. Strengthen Password Requirements
- Minimum length increased from 6 to 12 characters
- Require uppercase, lowercase, numeric, and special characters
- Add comprehensive validation with clear error messages
4. Fix Token Parsing Vulnerability
- Replace unsafe split().last().unwrap_or() pattern
- Use strip_prefix() for proper Bearer token validation
- Return InvalidToken error for malformed Authorization headers
5. Improve Authentication Logging
- Sanitize error messages to avoid leaking user existence
- Change from "User not found or incorrect password" to "Failed login attempt"
All changes tested and verified with existing test suite (65/65 tests passing).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Switch from fat LTO to thin LTO for faster release builds while maintaining similar performance characteristics.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>