knowledge: stamp model + backend on facts for audit

Adds two nullable TEXT columns to entity_facts —
`created_by_model` (LLM identifier) and `created_by_backend`
("local" / "hybrid" / "manual" / NULL) — so the curator can audit
which configurations produce good fact-keeping and which produce
noise.

photo_insights already carries model_version + backend, and
entity_facts.source_insight_id links to it, but:
  - source_insight_id is set post-loop, so chat-continuation and
    regenerated-insight facts lose the link.
  - JOINing per read is more friction than embedding provenance on
    the row itself.
  - Manual facts (POST /knowledge/facts) have no insight at all and
    need their own "manual" provenance marker.

Threading: execute_tool grows `model` + `backend` params, passed
from the three call sites (agentic insight loop, chat single-turn,
chat stream) using the loop-time `chat_backend.primary_model()` +
`effective_backend` already in scope. tool_store_fact stamps the
new fact accordingly; manual create_fact stamps backend="manual".
Legacy rows keep both columns NULL: pre-tracking data can't be
backfilled reliably from training_messages without burning compute.
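
The stamping rule above can be sketched roughly as follows. This is a
minimal illustration, not the code in this commit: `NewFact`,
`store_fact_from_tool`, and `store_fact_manual` are hypothetical
stand-ins for the real insert struct, tool_store_fact, and
create_fact, and the subject/predicate/object fields are assumed.

```rust
/// Hypothetical, trimmed-down insert struct; the real one carries more
/// fields (valid_from, valid_until, superseded_by, ...).
#[derive(Debug, Clone, PartialEq)]
pub struct NewFact {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    /// LLM identifier; None for manual facts and legacy rows.
    pub created_by_model: Option<String>,
    /// "local" / "hybrid" / "manual"; None on legacy rows.
    pub created_by_backend: Option<String>,
}

/// Agent path (stand-in for tool_store_fact): the caller passes the
/// loop-time model and backend, and both are stamped on the new fact.
pub fn store_fact_from_tool(
    subject: &str,
    predicate: &str,
    object: &str,
    model: &str,
    backend: &str,
) -> NewFact {
    NewFact {
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
        created_by_model: Some(model.to_string()),
        created_by_backend: Some(backend.to_string()),
    }
}

/// Manual path (stand-in for create_fact): no model, backend "manual".
pub fn store_fact_manual(subject: &str, predicate: &str, object: &str) -> NewFact {
    NewFact {
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
        created_by_model: None,
        created_by_backend: Some("manual".to_string()),
    }
}
```

Under these assumptions, an agent-written fact carries both values
and a manual fact carries only backend="manual", so the audit view
can tell the two apart without joining to photo_insights.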

Indexes are partial (WHERE the indexed column IS NOT NULL), so
legacy rows don't bloat them and "show me all facts from model X"
stays fast.
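
For illustration, partial indexes of this shape might look like the
SQL below, held here as Rust constants. This is a sketch under
assumptions: the index names and the `id` column are hypothetical,
the actual migration text is not shown in this message, and the SQL
dialect is assumed to support partial indexes (SQLite and PostgreSQL
both do).

```rust
/// Hypothetical migration SQL: one partial index per provenance column.
/// Rows where the column is NULL (legacy rows) are left out of the index.
pub const CREATE_PROVENANCE_INDEXES: &str = "\
CREATE INDEX idx_entity_facts_created_by_model
    ON entity_facts (created_by_model)
    WHERE created_by_model IS NOT NULL;
CREATE INDEX idx_entity_facts_created_by_backend
    ON entity_facts (created_by_backend)
    WHERE created_by_backend IS NOT NULL;
";

/// Hypothetical audit query. An equality test on created_by_model can
/// only match non-NULL values, which is the condition the partial
/// index is built on.
pub const FACTS_BY_MODEL: &str =
    "SELECT id FROM entity_facts WHERE created_by_model = ?1";
```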

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cameron Cordes
2026-05-10 20:05:14 -04:00
parent 85f3716379
commit f53338923d
8 changed files with 90 additions and 2 deletions

@@ -119,6 +119,10 @@ pub struct FactDetail {
/// supersession, migration 2026-05-10-000200). Only set when
/// status == 'superseded'.
pub superseded_by: Option<i32>,
/// Provenance — see migration 2026-05-10-000300. NULL on legacy
/// rows. `created_by_backend` is "local" / "hybrid" / "manual".
pub created_by_model: Option<String>,
pub created_by_backend: Option<String>,
/// Set when another active fact has the same subject+predicate,
/// a different object, AND their valid-time intervals overlap.
/// Detected at read time by the get_entity handler grouping
@@ -432,6 +436,8 @@ async fn get_entity<D: KnowledgeDao + 'static>(
valid_from: f.valid_from,
valid_until: f.valid_until,
superseded_by: f.superseded_by,
created_by_model: f.created_by_model,
created_by_backend: f.created_by_backend,
in_conflict: false,
});
}
@@ -768,6 +774,11 @@ async fn create_fact<D: KnowledgeDao + 'static>(
valid_from: body.valid_from,
valid_until: body.valid_until,
superseded_by: None,
// Manual creation via curation UI — provenance recorded as
// "manual" with no model, distinguishing user-entered facts
// from agent-generated ones in the audit view.
created_by_model: None,
created_by_backend: Some("manual".to_string()),
};
match dao.upsert_fact(&cx, insert) {