Commit Graph

433 Commits

Author SHA1 Message Date
Cameron
0e55a6b125 fix(ai): treat rewind at end of history as no-op success
The mobile client's regenerate-after-failure flow sends a discard index
equal to the server's rendered count (its optimistic user bubble for the
failed turn was never persisted). find_raw_cut treated this as out of
range, surfacing as "Chat rewind failed: discard_from_rendered_index out
of range" and blocking the retry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:12:17 -04:00
Cameron
0ebc2e9003 feat(ai): rerank timing + think:false + OpenRouter error detail
- search_rag reranker now logs wall-clock time around the ollama.generate
  call, the candidate count / top-N going in, and the final reordering.
  The "final indices" + swap-count line is info level so it's always
  visible; detailed before/after previews stay at debug for when you want
  to inspect reranker quality.
- New OllamaClient::generate_no_think convenience that sets Ollama's
  top-level think:false on the request, plumbed through try_generate via
  a new internal generate_with_options. Used only by the reranker today;
  avoids the chain-of-thought tax on reasoning models (Qwen3/VL,
  DeepSeek-R1 distills, GPT-OSS) when the task has nothing to reason
  about. Server-side no-op on non-reasoning models.
- OpenRouter chat_with_tools "missing choices[0]" error now includes the
  actual response body — extracts structured {error: {code, message}}
  when OpenRouter surfaces it (common for upstream-provider issues like
  rate limits and content moderation), otherwise falls back to a
  truncated raw-JSON view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:19:45 -04:00
Cameron
e5781325c6 fix(ai): render tool-call arguments as compact JSON in logs
Switch the "Agentic tool call" log from {:?} (Debug) to {} (Display) on
serde_json::Value. Display produces compact JSON — `{"date":"2023-08-15"}`
instead of `Object {"date": String("2023-08-15")}` — which is what the
model actually sent and what a human reading the log wants to see.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:25:53 -04:00
Cameron
d43f5fc991 docs: document OLLAMA_REQUEST_TIMEOUT_SECONDS env var
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:54:23 -04:00
Cameron
f0ae9f95dc feat(ai): few-shot exemplars + sticky Ollama preference
- Few-shot injection on /insights/generate/agentic: compresses prior
  training_messages into trajectory blocks (tool calls + result summaries)
  and injects into the system prompt. Hardcoded default ids with optional
  request override.
- New fewshot_source_ids column on photo_insights (+ migration) to track
  which exemplars influenced a given row, for downstream training-set
  filtering. Chat amend rows stamp None with a lineage note.
- Ollama client now remembers which server (primary/fallback) most
  recently succeeded and tries it first on the next call, via a shared
  Arc<AtomicBool>. Avoids re-404ing the primary on every agent iteration
  when the chosen model only lives on the fallback.
- Demote noisy logs: daily_summary "Summary match" lines to debug;
  inner chat_with_tools non-2xx body log from error to warn (outer
  layer owns the terminal-error signal).
- Drift-guard tests for summarize_tool_result covering the success /
  empty / error / unknown shape for every tool.
- Tidy: three pre-existing clippy warnings cleaned up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:54:06 -04:00
Cameron
29f32b9d22 FFMPEG playlist improvements
Better playlist management, .tmp renaming, HLS playlist parameter and concurrency tweaking.
2026-04-24 10:08:03 -04:00
Cameron
13b9d54861 fix(scan): quiet startup scans & thumbnail RAW/HEIC
Three recurring issues on every full scan:

1. Video playlist scans re-enqueued every file only to reject it as
   AlreadyExists. Pre-filter in ScanDirectoryMessage and QueueVideosMessage
   so we skip videos whose .m3u8 already exists, and demote the leaked
   AlreadyExists log to debug.

2. image crate was built with only jpeg/png features, so webp/tiff/avif
   files logged "format not supported" every scan. Enable those features.

3. RAW (ARW/NEF/CR2/...) and HEIC thumbnails weren't generated, so the
   scan kept retrying them. Try the file's embedded JPEG preview via
   kamadak-exif first (fast, pure-Rust, works on Sony ARW where ffmpeg's
   TIFF decoder fails). Fall back to ffmpeg for HEIC/HEIF and RAWs with
   no preview. Anything still undecodable gets a <thumb>.unsupported
   sentinel so future scans skip it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 20:47:13 -04:00
Cameron
dc2a96162e fix(dates): prefer earliest of fs created/modified as fallback
On copied or restored files (e.g. a backup library), the OS stamps
created at copy time while modified is preserved from the source, so
the earlier of the two is a better proxy for when the content
originated. Adds utils::earliest_fs_time and threads it through the
three spots that fall back to filesystem dates: photos-list sort,
memories grouping, and insight-generation timestamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:20:12 -04:00
Cameron
d54419e779 style: cargo fmt drift
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:19:59 -04:00
Cameron
aa651d1c7b feat(ai): iteration budget in prompt + preserve photo-knowledge links
- Inject the max-iterations budget into the agentic system prompt for
  both insight generation and chat turns. Chat does this per-turn by
  appending a note to the replayed system message and restoring it
  before persistence so the note doesn't accumulate across turns.
- Stop deleting entity_photo_links at the start of agentic insight
  generation. The clear made recall_facts_for_photo always return
  empty, wasting a tool call and discarding knowledge from prior runs.
  Re-linking the same entity is already an INSERT OR IGNORE no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:28:48 -04:00
Cameron
6831f50993 feat(ai): USER_NAME env + shared summary prompt + test-bin knobs
Introduces USER_NAME (default "Me") as the single source for the message
sender label and the first-person persona across daily summaries, SMS
context, insight generation, and chat. Eliminates the "Me:" transcript /
"what I did" ambiguity that confused smaller models, and unhardcodes
"Cameron" from prompt text + the knowledge-graph owner entity. Set
USER_NAME=Cameron in .env to preserve the existing owner entity row
(keyed on UNIQUE(name, entity_type)) — otherwise the next run creates
a fresh owner entity and orphans the existing facts/photo-links.

Also:
- search_messages redirect: when the model calls it with date/contact
  but no query, return a hint pointing at get_sms_messages instead of
  a bare missing-parameter error (prevents same-turn retry loops)
- sharpen search_messages vs get_sms_messages tool descriptions so
  content-vs-time-based intent is unambiguous
- extract build_daily_summary_prompt (+ DAILY_SUMMARY_MESSAGE_LIMIT,
  DAILY_SUMMARY_SYSTEM_PROMPT) shared by daily_summary_job and
  test_daily_summary binary — prompt tweaks now land in both
- EMBEDDING_MODEL const; fixes both insert sites that stored
  "mxbai-embed-large:335m" while generate_embeddings actually runs
  "nomic-embed-text:v1.5"
- test_daily_summary: add --num-ctx / --temperature / --top-p /
  --top-k / --min-p flags wired into OllamaClient setters, and print
  the configured knobs at the top of each run
- OllamaClient::generate now logs prompt/gen token counts and tok/s
  via log_chat_metrics (symmetric with chat_with_tools)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:39:37 -04:00
Cameron
e4a3536f87 feat(ai): search_messages tool + RAG reranker
Adds a search_messages tool that hits the Django FTS5/semantic/hybrid
endpoint for keyword-quality text search over message bodies, and an
LLM-based reranker inside tool_search_rag (gated by SEARCH_RAG_RERANK,
default on). Reranker pulls ~3x candidates from the vector index, asks
the chat model to rank by relevance, and falls back to vector order on
parse failure.

The reranker shares the active chat turn's OllamaClient so num_ctx and
sampling match — otherwise Ollama unloads/reloads the model on every
rerank call. (Unverified end-to-end; caught by inspection, awaiting
e2e confirmation.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 10:56:03 -04:00
Cameron
e51cd564a3 docs: chat continuation endpoints + env vars
Document the four new chat endpoints, SSE event shape, backend
routing rules, rewind semantics, amend mode, and the
AGENTIC_CHAT_MAX_ITERATIONS cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 17:32:43 -04:00
Cameron
079cd4c5b9 feat(ai): streaming chat endpoint with live tool events
Add LlmClient::chat_with_tools_stream and SSE endpoint
POST /insights/chat/stream that emits text deltas, tool_call /
tool_result pairs, truncated notice, and a terminal done frame as the
agentic loop runs.

- Ollama: parses NDJSON from /api/chat stream, accumulates content
  deltas, emits Done with tool_calls from the final chunk.
- OpenRouter: parses OpenAI-compatible SSE, reassembles tool_call
  argument deltas by index, asks for stream_options.include_usage.
- InsightChatService spawns the loop on a tokio task, feeds events
  through an mpsc channel, persists training_messages at the end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:57:41 -04:00
Cameron
c2bd3c08e1 feat(ai): surface tool invocations in chat history
load_history now groups preceding tool_call + tool_result scaffolding
under each assistant reply as `tools: [{name, arguments, result}]`.
Result bodies over 2000 chars are truncated for payload size with a
`result_truncated` flag; the full value remains in training_messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:03:53 -04:00
Cameron
65ab10e9a8 feat(ai): chat rewind + ollama metrics logging
Rewind: POST /insights/chat/rewind truncates training_messages at a
given rendered index, dropping the target message plus any preceding
tool-call scaffolding. The initial user prompt is protected.

Metrics: log prompt_eval_count/duration and eval_count/duration from
every Ollama chat response, rendered as tokens + ms + tok/s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:16:32 -04:00
Cameron
0b9528f61e feat(ai): chat continuation for photo insights (server v1)
Adds POST /insights/chat and GET /insights/chat/history. Replays the
stored agentic conversation through the same backend the insight was
generated with (or a per-turn override), runs a short tool-calling
loop, and persists the extended history in append or amend mode.

Backend switching: same-backend or hybrid->local replay verbatim;
local->hybrid is rejected in v1 (would require on-the-fly vision
description rewrite).

Per-(library, file) async mutex serialises concurrent turns. Soft
context budget drops oldest tool_call+result pairs when the
serialized history exceeds num_ctx - 2048 tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:00:27 -04:00
Cameron
e2eefbd156 feat(ai): curated OpenRouter model picker for hybrid backend
Add OPENROUTER_ALLOWED_MODELS env var and GET /insights/openrouter/models
endpoint returning the curated list verbatim. Drop the live capability
precheck in hybrid mode — trust the operator's allowlist; bad ids surface
as a chat-call error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:36:19 -04:00
Cameron
3ac0cd62eb feat(ai): hybrid backend mode for agentic insights
Adds a `backend` column to photo_insights (default 'local', migration
2026-04-20-000000) and a corresponding optional `backend` field on the
agentic request. When a request sets backend=hybrid:

- The local Ollama vision model is called once via describe_image to
  produce a text description.
- The description is inlined into the first user message as text —
  no base64 image is ever sent to the chat model.
- The agentic tool-calling loop and title generation route through an
  OpenRouterClient (dispatched via &dyn LlmClient), letting the user
  pick any tool-capable model from OpenRouter per request.
- describe_photo is removed from the offered tools since the description
  is already present.

Embeddings and vision stay on local Ollama regardless of backend.
Hybrid mode requires OPENROUTER_API_KEY; handlers return a clear error
when hybrid is requested without it, and also when the selected
OpenRouter model lacks tool-calling support.

AppState gains an optional openrouter client built from
OPENROUTER_API_KEY / OPENROUTER_BASE_URL / OPENROUTER_DEFAULT_MODEL /
OPENROUTER_EMBEDDING_MODEL / attribution headers. Default model is
anthropic/claude-sonnet-4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:30:40 -04:00
Cameron
e799ba716c feat(ai): add OpenRouterClient implementing LlmClient
OpenAI-compatible client for OpenRouter. Translates canonical wire shapes at
the boundary: tool-call arguments stringify on send / parse on receive
(accepting both string and native-object forms); images rewritten from the
base64 images field into content-parts with image_url entries; role=tool
messages inherit tool_call_id from the preceding assistant's tool calls.

/models parsed into ModelCapabilities via supported_parameters (tool use)
and architecture.input_modalities (vision). 15-minute capabilities cache.
Bearer auth; HTTP-Referer / X-Title attribution headers optional.

Not wired into request routing yet — first consumer arrives with hybrid
backend mode. 11 unit tests cover the translation helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:18:29 -04:00
Cameron
0073409b3d refactor: introduce LlmClient trait (no-op)
Preparation for a second LLM backend (OpenRouter) and hybrid vision-local /
chat-remote mode. Shared wire types (ChatMessage, Tool, ToolCall, etc.) move
into a new src/ai/llm_client.rs and are re-exported from ollama.rs so
existing imports keep working. OllamaClient now implements LlmClient.

No behavior change; callers still hold the concrete OllamaClient. Caller
migration to Arc<dyn LlmClient> is deferred to the PR that wires hybrid
backend routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:11:05 -04:00
702aa8078c Merge pull request '004 Multi-library Support' (#54) from 004-multi-library into master
Reviewed-on: #54
2026-04-21 01:55:22 +00:00
Cameron
bffe604527 Remove potentially confusing TZ from insight generator 2026-04-21 01:55:07 +00:00
Cameron
39c212b0e6 Bump to 1.0.0 for multi-library support 2026-04-21 01:55:07 +00:00
Cameron
a35b45fd36 feat: expand insight tool result caps and render timestamps in local time
Doubled default row caps for search_rag/get_sms_messages/get_calendar_events/recall_entities and exposed an optional `limit` parameter on each so the agent can tune per call. Render all LLM-facing timestamps as server-local time with explicit offset so smaller models stop misreading UTC as wall-clock time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
3027a3ffda perf: DB-backed recursive /photos + watcher reconciliation
Recursive listings now query image_exif instead of walking disk, taking
union-mode /photos from ~17s to sub-second on a 10k-file library. The
watcher's full scan prunes stale image_exif rows so the DB stays in
parity with the filesystem when files are deleted externally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
4a775b5e9b test: cover resolve_library_param and per-library ExifDao filter
Adds 9 unit tests around the library plumbing:
- resolve_library_param branches (absent, empty/whitespace, numeric id,
  name, unknown id, unknown name)
- Library::resolve symmetry with strip_root
- ExifDao::get_all_with_date_taken in union and scoped modes

Introduces SqliteExifDao::from_connection test constructor mirroring the
existing preview_dao pattern so DAO tests can drive an in-memory SQLite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
b04dd8b601 fix: demote path-not-exists validation errors to debug
The /image cross-library fallback tries the resolved library first and falls
back to any library holding the rel_path. The first attempt emitted error-level
noise on every grid tile in union mode. Split the validation error so only
traversal attempts log at error; missing-file cases log at debug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
2c8de8dcc6 feat: union /photos and /memories across libraries
When `library` is omitted, both endpoints now walk every configured
library root, interleave the results, and tag each row with its source
library via the parallel `photo_libraries` / per-row `library_id`
arrays. Previously the handlers fell back to the primary library,
silently hiding the rest.

Threads a parallel `file_libraries: Vec<i32>` through the sort/paginate
helpers so library attribution survives sorting and pagination.
Directory names are de-duplicated across libraries.

`get_all_with_date_taken` grows an optional library filter so memories
can scope its EXIF query per-library during the union walk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
586b735af5 feat: include per-photo library id in /photos response
Adds a parallel `photo_libraries: Vec<i32>` array alongside `photos`
in `PhotosResponse` so clients can render per-thumbnail badges.
Populated with the scoped library id at the two main return sites;
left empty for `/favorites` since favorites are library-agnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
c2ee3996be chore: apply cargo fmt + clippy cleanup across crate
Silence forward-looking dead_code on unused DAO modules, annotate
individual placeholder items, rewrite tautological assert!(true/false)
in token tests as panic! arms, and pick up fmt drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
a0f3bfab5f fix: validate gps-summary path against every library
The /photos/gps-summary handler validated the incoming path against
the primary library's root with new_file=false, which requires the
path to exist on disk. For a viewer opened on a file from a
non-primary library, tapping the GPS link produced activePath =
<folder from lib 2>, the primary-only check failed, and the server
400'd — so the map came up empty.

Validation here is purely a traversal guard (the DAO does a prefix
LIKE against rel_path), so we now accept the path as long as any
configured library can resolve it without escaping its root.

Also applies cargo fmt drift on files touched this session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
54a1df60b8 fix: resolve preview clip rel_path against all libraries
PreviewClipGenerator stripped a single base_path, so videos in a
non-primary library ended up with the absolute path as 'relative'.
On Windows, PathBuf::from(preview_clips_dir).join(absolute) replaces
with the absolute path, and .with_extension("mp4") on a .mp4 input
yields the input path — ffmpeg then errors out with 'cannot edit
existing files in place'.

The generator now holds Vec<Library> and strips whichever root
actually contains the video, with separator normalization to match
the rest of the code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
7becbc0737 fix: normalize rel_path separators in non-recursive /photos listing
On Windows, strip_prefix preserves backslashes, so the non-recursive
branch was looking up tags for 'Melissa\img1.jpg' while tagged_photo
stores 'Melissa/img1.jpg' — every file was filtered out. Normalize to
'/' to match the watcher and populate_knowledge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
e6ee38edec fix: resolve media across libraries for video, metadata, and insights
The /video/generate and /image/metadata handlers assumed files live under
the resolved library only, which broke when a mobile client passed no
library (union mode) but the file lived in a non-primary library. Both
now fall back to scanning every configured library for an existing file.

InsightGenerator held a single base_path, so vision-model loads and
filename-date fallbacks failed for non-primary libraries. It now takes
Vec<Library> and probes each root in resolve_full_path.

/image/metadata responses now carry library_id/library_name so the
mobile viewer can surface which library a file belongs to.

Thumbnail generation at startup is now spawned on a background thread
so the HTTP server can accept traffic while large libraries backfill.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
2d942a9926 feat: content-hash-aware tag/insight sharing + library scoping
Tags and insights now follow content across libraries via content_hash
lookups on the read path, so the same file indexed at different rel_paths
in multiple libraries shares its annotations. Recursive tag search scopes
hits to the selected library by checking each tagged rel_path against
the library's disk (with a content-hash sibling fallback so tags attached
under one library's rel_path still match a content-equivalent file in
another). The /image and /image/metadata handlers fall back across
libraries when the file isn't under the resolved one, so union-mode
search results (which carry no library attribution in the response)
still serve correctly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
c01a0479b7 fix: honor library param in /image, /photos, /memories
The Phase 3 plumbing accepted `library=` but didn't actually route
requests through the scoped library once it was resolved. Three
concrete bugs surfaced when testing against a second mounted library:

- `/image` always resolved paths against AppState.base_path (primary),
  so thumbnails for non-primary libraries 400'd when their rel_paths
  didn't exist under primary. Now resolves against the scoped library
  and defaults to primary when the param is omitted.

- `/memories` walked the scoped library correctly but its helper
  functions hardcoded `library_id: PRIMARY_LIBRARY_ID` on every
  MemoryItem, causing clients to route thumbnails back to primary
  regardless of which library the memory actually came from.

- `/photos` non-recursive listing delegated to a `RealFileSystem`
  constructed from AppState.base_path at startup, so walks always
  hit primary even when `library=2` was passed. The non-primary
  path now uses list_files against the scoped library's root;
  primary still goes through FileSystemAccess to preserve the
  existing test mock plumbing.

Also adds `library` to ThumbnailRequest so the /image query param
is actually parsed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
0aaea91cc2 feat: add content_hash backfill + register every media file
Adds blake3 content hashing as the basis for derivative dedup
(thumbnails, HLS) across libraries. Computed inline by the watcher on
ingest and by a new `backfill_hashes` binary for historical rows.

Key changes:
- `content_hash` and `size_bytes` are now populated on new image_exif
  rows; a new ExifDao surface (`get_rows_missing_hash`,
  `backfill_content_hash`, `find_by_content_hash`) supports backfill and
  future hash-keyed lookups.
- The watcher now registers every image/video in image_exif, not just
  files with parseable EXIF. EXIF becomes optional enrichment; videos
  and other non-EXIF files still get a hashed row. This also makes
  DB-indexed sort/filter cover the full library.
- `/image` thumbnail serve dual-looks up hash-keyed path first, then
  falls back to the legacy mirrored layout.
- Upload flow accepts `?library=` query param + hashes uploaded files.
- Store_exif logs the underlying Diesel error on insert failure so
  constraint violations surface instead of hiding behind a generic
  InsertError.
- New migration normalizes rel_path separators to forward slash across
  all tables, deduplicating any rows that collide after normalization.
  Fixes spurious UNIQUE violations from mixed backslash/forward-slash
  paths on Windows ingest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
ce5b337582 feat: make file watcher, thumbnails, and upload library-aware
`watch_files` and `create_thumbnails` now iterate every configured
library, tagging rows with the correct `library_id`. `process_new_files`
takes a `&Library` so InsertImageExif no longer hardcodes the primary
library. Upload accepts an optional `library` query param to pick a
target library; omitted still defaults to primary for backwards
compatibility.

Hash-keyed thumbnail/HLS storage with dual-lookup fallback is deferred
to Phase 5, where it's bundled with the content hash backfill that
actually makes the hash-keyed paths meaningful. Until hashes are
populated, the legacy mirrored layout is a no-op to change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
48e5de6eab feat: add GET /libraries and library query param plumbing
New `/libraries` endpoint returns configured libraries so clients can
discover them. `FilesRequest` and `MemoriesRequest` gain an optional
`library` param (accepts name or numeric id). Unknown values are
rejected with 400; absent values span all libraries. `/memories`
now scopes its filesystem walk + EXIF query to the resolved library.
`MemoryItem` carries `library_id` so union-mode clients can render a
per-item source badge.

Behavior is unchanged in single-library mode: omitting `library` still
returns results from the primary library, which is the only one
configured until a second row is added to the libraries table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
Cameron
ffcddbb843 feat: multi-library foundation (schema + libraries module)
Adds a `libraries` registry table and threads library_id through
per-instance metadata tables (image_exif, photo_insights,
entity_photo_links, video_preview_clips). File-path columns renamed to
rel_path to make the relative-to-root semantics explicit. Adds
content_hash + size_bytes on image_exif to support future hash-keyed
thumbnail/HLS dedup. Tags and favorites stay library-agnostic so they
share across libraries by rel_path.

Behavior is unchanged: a single primary library (id=1) is seeded from
BASE_PATH on first boot; all handlers and DAOs route through it as a
transitional shim until the API gains a library query param.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:55:07 +00:00
2f4edba08c Merge pull request '003-knowledge-memory' (#55) from 003-knowledge-memory into master
Reviewed-on: #55
2026-04-21 01:54:34 +00:00
Cameron
8bc948b297 Insight prompt tweaks 2026-04-17 11:55:33 -04:00
Cameron
b7e1bdf1fd feat: add sampling param CLI flags to populate_knowledge binary
Adds --temperature, --top-p, --top-k, --min-p flags so batch runs can
tune the same sampling params now supported by the API endpoints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 14:23:45 -04:00
Cameron
3059adfd37 Add missing DB migration sql for training data 2026-04-15 09:28:53 -04:00
Cameron
b599f7a34b feat: add temperature, top_p, top_k, min_p params to insight generation
Expose Ollama sampling params through the insight generation endpoints
so users can tune creativity/determinism per request. All four are
optional — omitted values fall through to the model's server-side
defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 09:27:59 -04:00
Cameron
c703a47f17 Add the ability to rate insights to curate training data 2026-04-13 09:23:40 -04:00
Cameron
da16fddce3 Address path traversal and other security fixes 2026-04-10 14:58:57 -04:00
Cameron
e1c32b6584 Tweak Prompt 2026-04-10 14:30:31 -04:00
Cameron
65e938035f fix: reduce duplicate entities from weak model inconsistency
Adds normalize_entity_type() which lowercases and canonicalises synonyms
(location→place, human→person, etc.) before every upsert. The SQL lookup
now uses lower(entity_type) on both sides so existing dirty rows (Person,
Location) correctly deduplicate against normalised writes without a migration.

Adds a pre-flight similarity check in tool_store_entity: before upserting,
searches active entities of the same type using the first name token. Any
non-exact matches are appended to the tool response so the agentic loop
can choose to reuse an existing entity ID rather than create a duplicate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 18:27:09 -04:00