ImageApi

Author	SHA1	Message	Date
Cameron Cordes	d123cde333	knowledge: entity-graph endpoint for force-directed view New GET /knowledge/graph?type=&limit= returns the data the curation UI's graph tab needs: - nodes = entities with at least one in-scope fact (rejected / superseded excluded). Carries fact_count for visual sizing. Top-N by count desc; default cap 200 (clamped 1..1000). - edges = relational facts (object_entity_id set) grouped by (subject, object, predicate) so 3 "is_friend_of" facts between the same pair collapse into one edge with count=3. Two raw SQL queries: an INNER JOIN onto a persona-scoped fact- count subquery for nodes (skips 0-fact entities entirely so the sim doesn't waste time on disconnected islands), then a follow- up GROUP BY over the persona-scoped fact set restricted to the node id set via IN clauses (ids are i32 so inlining is safe). Pairs with the Apollo-side GraphPanel that runs d3-force over the returned payload and renders SVG with click-to-open. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 21:26:02 -04:00
Cameron Cordes	6dca0c027d	fmt: cargo fmt sweep No logic changes - line reflow, brace placement, and method-chain splits across handlers / personas / state / faces / knowledge / insights_dao / knowledge_dao / populate_knowledge. Picked up incidentally while running fmt for the sms-search work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 19:21:00 -04:00
Cameron Cordes	6620fa48d7	knowledge: consolidation proposals endpoint Finds near-duplicate entities the upsert-time cosine guard didn't catch — typically legacy data from before that guard landed, or pairs whose embeddings sit between 0.85 (default proposal floor) and 0.92 (auto-collapse threshold). Pure read-side feature; the actual merging still goes through the existing /knowledge/entities/merge action. New DAO method `find_consolidation_proposals(threshold, max_groups)`: - Loads every non-rejected entity with an embedding. - Partitions by entity_type so a person can't cluster with a place. - Pairwise cosine, edges above threshold feed a union-find for transitive grouping (Sara → Sarah → Sarah J. all land in one cluster). - Tracks min/max cosine per component so the UI can show "how tight" each cluster is before clicking in. - Returns groups of >= 2 sorted by size desc then max cosine desc; trimmed to `max_groups`. New endpoint `GET /knowledge/consolidation-proposals?threshold= &limit=` accepts the threshold (clamped 0.5–0.99 to prevent the "every entity in one mega-cluster" case) and returns groups with per-entity persona fact-count breakdowns baked in — saves the UI a separate query per cluster member. ConsolidationGroup is exported through database/mod.rs so the handler can use it without depending on knowledge_dao internals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 18:43:11 -04:00
Cameron Cordes	89d0a6527c	knowledge: per-entity persona breakdown for list + detail Entities are global; facts are persona-scoped. Under the active persona an entity can read as "0 facts" while having plenty under other personas the user owns — the curation UI had no way to surface that gap. Adds a batched DAO method `get_persona_breakdowns_for_entities` that returns {entity_id → [(persona_id, count)]} in one query (group by subject + persona, user-scoped, status != rejected), and wires it into both /knowledge/entities list rows and GET /knowledge/entities/{id}. EntitySummary grows an optional `persona_breakdown` field (skipped on serialization when None — keeps PATCH responses unchanged). EntityDetailResponse carries the breakdown as a non-optional Vec since the detail endpoint always populates it. One extra query per list page (50 entities → 50 subject ids batched in one IN clause); single-entity GET adds one round trip. Indexed by (subject_entity_id, persona_id) implicitly via the existing user-persona indexes on entity_facts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 18:29:20 -04:00
Cameron Cordes	f200466508	knowledge: forbid markdown in synthesized merge descriptions System prompt now explicitly enumerates the markdown forms the model shouldn't emit (bold, italics, headings, bullets, lists, code fences) on top of the existing "no preamble, no quotes" constraints. Some local models default to markdown-shaped output for descriptions and the curation UI is plain-text, which would render the asterisks and hashes literally. The output cleaning step picks up a parallel sweep: strip code fences, leading bullets / headings, wrapping quotes, and naive inline emphasis markers (** and __). Rare enough that the plain-replace is fine; not trying to parse markdown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 16:49:02 -04:00
Cameron Cordes	afac02cade	knowledge: synthesize-merge endpoint for LLM-curated descriptions New POST /knowledge/entities/synthesize-merge { source_id, target_id } that calls the local Ollama with both entities' names + descriptions and returns a synthesized merged-description draft. Read-only on the database — the curation UI uses the response as the editable seed in the merge picker; the actual merge still requires a follow-up PATCH-target-description + POST /merge. The handler drops the KnowledgeDao lock before the LLM call so other knowledge reads aren't blocked while generation runs (typically seconds). Failure mode is 503 with an explicit hint that the UI should fall back to skip-synthesis — keeps the merge action working when the model is offline. Output is lightly cleaned (leading "Merged description:" / surrounding quotes stripped) since small models reach for those patterns even with explicit "no preamble" guidance. Heavier parsing isn't worth it — the curator edits anyway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 16:37:26 -04:00
Cameron Cordes	fd4dd89bbb	knowledge: agent self-correction with audit + per-persona gate + revert Bundles three coupled changes so agent-side mutations stay auditable and reversible: 1. Audit columns on entity_facts — `last_modified_by_model` / `last_modified_by_backend` / `last_modified_at`. Stamped on every mutation path (update_fact, supersede_fact, manual PATCH, manual supersede, the new revert). NULL on rows never touched since creation. Partial index on `last_modified_at WHERE NOT NULL` keeps the "show me recent edits" feed fast without bloating from legacy rows. 2. Per-persona gate `personas.allow_agent_corrections` (BOOLEAN, default 0). Defense in depth at two layers: - build_tool_definitions: when off, `update_fact` and `supersede_fact` aren't in the catalog at all, so even a hallucinated tool call by the model fails fast. - tool_update_fact / tool_supersede_fact: re-checks the persona flag at call time and returns an explicit "corrections disabled" error if it's somehow off (e.g. flag flipped mid- loop). ToolGateOpts grows the flag; current_gate_opts splits into `current_gate_opts` (no persona context, defaults closed) + `current_gate_opts_for_persona` for chat callers that have a persona id. Both call sites in insight_chat are updated. 3. Revert action — new DAO method `revert_supersession` + `POST /knowledge/facts/{id}/restore`. Flips status back to 'active', clears `superseded_by`, clears `valid_until` (we don't track whether it was hand-set vs auto-stamped, so the safe reset is to drop it — user can re-bound after). Stamps `last_modified_` so the revert itself is attributable. Manual paths (PATCH / supersede via HTTP, plus restore) stamp the audit columns with `("manual", "manual")`. Agent paths stamp the loop-time chat model and backend (mirroring the existing created_by_ convention). FactDetail in the HTTP response now carries the audit triple alongside the existing provenance. Apollo wires the new field set in the matching commit. PersonaView / UpdatePersonaRequest grow `allowAgentCorrections`; the PersonaPatch + InsertPersona + bulk_import paths thread it. 317 lib tests pass, including unchanged update_fact / supersede DAO tests (now passing audit=None — None means "no provenance context to attribute", legacy semantics). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:56:56 -04:00
Cameron Cordes	f53338923d	knowledge: stamp model + backend on facts for audit Adds two nullable TEXT columns to entity_facts — `created_by_model` (LLM identifier) and `created_by_backend` ("local" / "hybrid" / "manual" / NULL) — so the curator can audit which configurations produce good fact-keeping and which produce noise. photo_insights already carries model_version + backend, and entity_facts.source_insight_id links to it, but: - source_insight_id is set post-loop, so chat-continuation and regenerated-insight facts lose the link. - JOINing per read is more friction than embedding provenance on the row itself. - Manual facts (POST /knowledge/facts) have no insight at all and need their own "manual" provenance marker. Threading: execute_tool grows `model` + `backend` params, passed from the three call sites (agentic insight loop, chat single-turn, chat stream) using the loop-time `chat_backend.primary_model()` + `effective_backend` already in scope. tool_store_fact stamps the new fact accordingly; manual create_fact stamps backend="manual". Legacy rows leave both NULL — pre-tracking data can't be back- filled reliably from training_messages without burning compute. Indexes are partial (WHERE NOT NULL) so legacy rows don't bloat them, and "show me all facts from model X" stays fast. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:05:14 -04:00
Cameron Cordes	85f3716379	knowledge: fact supersession + photo-date valid_from Two Phase-2 followups in one commit since they're coupled at the write path: * Agent populates valid_from from the source photo's date_taken when calling store_fact. Loose semantics — date_taken is evidence at that date, not strictly when the fact started being true — but gives the curator a calendar anchor and pairs with supersession to close intervals cleanly. valid_until stays NULL (a single photo can't tell us when something stopped). Honours the existing upsert_fact dedup (corroborated facts keep their first-recorded valid_from). * Supersession: new column entity_facts.superseded_by INTEGER (migration 2026-05-10-000200), new status value 'superseded', new DAO method supersede_fact, new HTTP endpoint POST /knowledge/facts/{id}/supersede. Marking an old fact as replaced by a new one atomically: flips status to 'superseded', sets superseded_by, and stamps valid_until from the new fact's valid_from (when not already set). delete_fact clears dangling supersession pointers in the same transaction so the column never points at a missing row — no FK because SQLite can't ALTER ADD with REFERENCES, but the DAO maintains the invariant. Pairs with conflict detection from the previous slice: once the old fact's valid_until is closed, its interval no longer overlaps the new fact's, so they stop flagging — the supersede action resolves the conflict. Two tests pin the contract: supersede stamps valid_until from new.valid_from while respecting an existing valid_until, and deleting the supersedeR clears the dangling pointer while leaving the old fact's 'superseded' status in place for history. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:47:06 -04:00
Cameron Cordes	01f5ad7527	knowledge: valid-time on facts + interval-aware conflict detection Adds bitemporal support to entity_facts. Existing `created_at` is transaction time (when we recorded the fact); the new `valid_from` / `valid_until` BIGINT columns are valid time (when the fact is/was true in the real world). NULL on either side = unbounded on that side, both NULL = "always-true / unknown" — matches the default state of every legacy row, no backfill needed. The split matters for time-bounded predicates like is_in_relationship_with / lives_in / works_at: recording the fact once doesn't mean the relationship is still ongoing. Same predicate across different windows ("lives_in NYC 2018-2020", "lives_in SF 2020-present") is no longer a conflict — the interval-aware check in get_entity only flags pairs whose windows overlap. Facts with no valid-time data still flag against everything (worst case for legacy rows — user adds dates to suppress). API surface: - POST /knowledge/facts accepts optional valid_from / valid_until. - PATCH /knowledge/facts/{id} accepts both with tri-state semantics: field omitted = leave alone, JSON null = clear to NULL, number = set. Implemented via a small serde helper around Option<Option>. - GET /knowledge/entities/{id} surfaces both fields per fact and uses them in conflict detection. Agent path (insight_generator) writes NULL/NULL for now — deriving valid_from from the source photo's date_taken is slated for a follow-up agent tool alongside Phase 2's supersession. Test pins set + clear semantics via update_fact: setting both bounds, leaving them alone on a subsequent patch, then clearing valid_until back to NULL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:25:55 -04:00
Cameron Cordes	bcd5312953	knowledge: detect same-predicate object conflicts at read time GET /knowledge/entities/{id} now flags facts as `in_conflict` when another active fact shares the same predicate but disagrees on the object (entity id or text value). Pure read-time computation in the handler — group facts by predicate, distinct-object count > 1 flags all members. No schema change; same shape as `is_current` on photo insights. The flag is intentionally a signal, not a hard constraint. Some predicates are legitimately multi-valued (friend_of, tagged_in, appears_in) — the curator UI surfaces the amber accent and lets the user reject the stale fact, accept both, or supersede one later once the supersession column lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:14:58 -04:00
Cameron Cordes	0b8478a5e4	knowledge: list sort + persona-scoped fact_count per entity Two related additions to /knowledge/entities: - New EntitySort enum (UpdatedDesc default, NameAsc, FactCountDesc) surfaced via `?sort=updated\|name\|count`. NameAsc clusters near- duplicate names so dupes stand out at a glance; FactCountDesc surfaces heavily-used entities and demotes 0-fact noise to the bottom. - New `list_entities_with_fact_counts` DAO method that returns each entity alongside a persona-scoped count of its non-rejected facts (subject side). Persona scope follows X-Persona-Id via the existing resolve_persona_filter chain — Single filters on (user_id, persona_id), All unions across the user's personas. Implemented as one raw SQL query with a LEFT JOIN to a fact-count subquery and ORDER BY tied to the chosen sort, so count-sort needs no second round trip. The agent's existing list_entities call site is unchanged — it doesn't need persona-scoped counts and the trait method stays cheap. EntitySummary grows an Option<i64> fact_count (skip_serializing_if none) so PATCH responses stay shaped as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 16:04:13 -04:00
Cameron Cordes	f7ce3d2b22	knowledge: include library_id in photo_links response The PhotoLinkDetail in /knowledge/entities/{id} was dropping the library_id field, leaving consumers no way to construct a content-routed thumbnail URL. Apollo's curation screen was falling through to library=0 (the FastAPI default) and getting 400s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 15:19:37 -04:00
Cameron Cordes	d7aee4f228	knowledge: cosine dedup, fact create endpoint, recall nudge Phase 1 of the knowledge curation work. Three small server-side changes to support an Apollo-side curation surface and reduce the agent's near- duplicate output rate going forward: - upsert_entity grows an embedding-cosine fallback after the exact name match misses. New entities whose embedding sits above ENTITY_DEDUP_COSINE_THRESHOLD (default 0.92) against any same-type active entity collapse onto the existing row. Eliminates the Sarah / Sara / Sarah J. trio the FTS5 prefix check was missing. - POST /knowledge/facts symmetric with the existing PATCH/DELETE so the curation UI can create facts directly. Persona-scoped via X-Persona-Id; validates subject (and optional object) entity existence; reuses KnowledgeDao::upsert_fact so corroboration semantics match the agent path. - One sentence in build_system_content telling the agent to call recall_entities before store_entity when a name resembles something already known. Cheap; complements the DAO-layer guard. Includes upsert_entity_collapses_near_duplicate_by_embedding test covering both the collapse-on-near-match path and the don't-collapse-on- unrelated-embedding path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 15:16:05 -04:00
Cameron Cordes	fbd769e475	personas: composite FK + built-in update guard Two persona-infrastructure correctness fixes that go together because the second one (FK with CASCADE) requires the first (preventing the persona row from being mutated out from under its facts). 1. update_persona handler refuses name/systemPrompt edits to built-ins (409). includeAllMemories stays editable — that's a per-user preference, not the persona's identity. Mirrors the existing delete_persona guard. The DAO is intentionally permissive so the guard sits at the HTTP layer; persona_dao test pins that contract. 2. Migration 2026-05-10 adds user_id to entity_facts and a composite FK (user_id, persona_id) -> personas(user_id, persona_id) ON DELETE CASCADE. This closes two issues at once: - Persona orphans: deleting a custom persona used to leave its facts dangling forever, readable only via PersonaFilter::All. CASCADE now wipes them with the persona row. - Multi-user fact leakage: PersonaFilter::Single("default") used to surface every user's default-scoped facts. PersonaFilter is now { user_id, persona_id } and all read paths (get_facts_for_entity, list_facts, get_recent_activity) filter on user_id first. upsert_fact's dedup key extends to user_id so identical claims under shared persona names from different users no longer corroborate-bump each other's confidence. - user_id threads from Claims.sub.parse::<i32>().unwrap_or(1) at the chat / insight handlers through ChatTurnRequest, the streaming agentic loop, execute_tool, and into the leaf tools (tool_store_fact, tool_recall_facts_for_photo). The ".unwrap_or(1)" accommodates Apollo's service token whose sub is non-numeric on legacy mints. - Backfill picks the smallest user_id matching each legacy fact's persona_id so the FK holds for already-stored rows. Five new knowledge_dao tests with FK-on connection: persona scoping isolation, All-variant union per-user, dedup not crossing users, CASCADE delete, FK rejection of unknown personas. Plus dao_update_does_not_block_built_ins documenting where the HTTP-layer guard lives. Apollo coordinates separately — the matching changes there add the /api/personas proxy and start sending persona_id on photo-chat turns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 13:30:35 -04:00
Cameron Cordes	3e2f36a748	personas: elevate to server with per-persona fact scoping Move personas off the mobile client into ImageApi as first-class records, and scope entity_facts by persona so each one builds its own voice over a shared entity graph. The new include_all_memories flag lets a persona opt back into the full hive-mind pool for human browsing of /knowledge/*; agentic generation always stays in-voice. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 17:59:20 -04:00
Cameron	191ccc0d77	feat: add entity-relationship knowledge memory to agentic insights Implements persistent cross-photo knowledge memory so the agentic insight loop can learn and recall facts about people, places, and events across the photo collection. Changes: - photo_insights: drop UNIQUE(file_path) + INSERT OR REPLACE, replace with append-only rows + is_current flag for insight history retention - New tables: entities, entity_facts, entity_photo_links with FK constraints and confidence scoring - KnowledgeDao trait + SqliteKnowledgeDao with upsert, merge, and corroboration (confidence +0.1 on duplicate fact detection) - Four new agent tools: recall_entities, recall_facts_for_photo, store_entity, store_fact (with object_entity_id FK support) - Cameron entity auto-seeded with stable ID injected into system prompt - Pre-run photo link clearing + post-loop source_insight_id backfill - Audit REST API: GET/PATCH/DELETE /knowledge/entities/{id}, POST /knowledge/entities/merge, GET/PATCH/DELETE /knowledge/facts/{id}, GET /knowledge/recent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-03 17:27:49 -04:00

17 Commits