knowledge: agent self-correction with audit + per-persona gate + revert
Bundles three coupled changes so agent-side mutations stay
auditable and reversible:
1. Audit columns on entity_facts —
`last_modified_by_model` / `last_modified_by_backend` /
`last_modified_at`. Stamped on every mutation path
(update_fact, supersede_fact, manual PATCH, manual supersede,
the new revert). NULL on rows never touched since creation.
Partial index on `last_modified_at WHERE NOT NULL` keeps the
"show me recent edits" feed fast without bloating from legacy
rows.
2. Per-persona gate `personas.allow_agent_corrections` (BOOLEAN,
default 0). Defense in depth at two layers:
- build_tool_definitions: when off, `update_fact` and
`supersede_fact` aren't in the catalog at all, so even a
hallucinated tool call by the model fails fast.
- tool_update_fact / tool_supersede_fact: re-checks the persona
flag at call time and returns an explicit "corrections
disabled" error if it's somehow off (e.g. flag flipped mid-
loop).
ToolGateOpts grows the flag; current_gate_opts splits into
`current_gate_opts` (no persona context, defaults closed) +
`current_gate_opts_for_persona` for chat callers that have a
persona id. Both call sites in insight_chat are updated.
3. Revert action — new DAO method `revert_supersession` +
`POST /knowledge/facts/{id}/restore`. Flips status back to
'active', clears `superseded_by`, clears `valid_until` (we
don't track whether it was hand-set vs auto-stamped, so the
safe reset is to drop it — user can re-bound after). Stamps
`last_modified_*` so the revert itself is attributable.
Manual paths (PATCH / supersede via HTTP, plus restore) stamp the
audit columns with `("manual", "manual")`. Agent paths stamp the
loop-time chat model and backend (mirroring the existing
created_by_* convention).
FactDetail in the HTTP response now carries the audit triple
alongside the existing provenance. Apollo wires the new field set
in the matching commit.
PersonaView / UpdatePersonaRequest grow `allowAgentCorrections`;
the PersonaPatch + InsertPersona + bulk_import paths thread it.
317 lib tests pass, including unchanged update_fact / supersede
DAO tests (now passing audit=None — None means "no provenance
context to attribute", legacy semantics).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -18,6 +18,7 @@ pub struct PersonaPatch {
|
||||
pub system_prompt: Option<String>,
|
||||
pub include_all_memories: Option<bool>,
|
||||
pub reviewed_only_facts: Option<bool>,
|
||||
pub allow_agent_corrections: Option<bool>,
|
||||
}
|
||||
|
||||
/// One row of a bulk migration upload. Fields named to match the JSON
|
||||
@@ -166,6 +167,7 @@ impl PersonaDao for SqlitePersonaDao {
|
||||
created_at: now,
|
||||
updated_at: now,
|
||||
reviewed_only_facts: false,
|
||||
allow_agent_corrections: false,
|
||||
})
|
||||
.execute(conn.deref_mut())
|
||||
.map_err(|e| anyhow::anyhow!("Insert error: {}", e))?;
|
||||
@@ -222,6 +224,15 @@ impl PersonaDao for SqlitePersonaDao {
|
||||
.execute(conn.deref_mut())
|
||||
.map_err(|e| anyhow::anyhow!("Update reviewed_only_facts error: {}", e))?;
|
||||
}
|
||||
if let Some(new_allow_corrections) = patch.allow_agent_corrections {
|
||||
diesel::update(personas.filter(user_id.eq(uid)).filter(persona_id.eq(pid)))
|
||||
.set((
|
||||
allow_agent_corrections.eq(new_allow_corrections),
|
||||
updated_at.eq(now),
|
||||
))
|
||||
.execute(conn.deref_mut())
|
||||
.map_err(|e| anyhow::anyhow!("Update allow_agent_corrections error: {}", e))?;
|
||||
}
|
||||
|
||||
personas
|
||||
.filter(user_id.eq(uid))
|
||||
@@ -396,6 +407,7 @@ mod tests {
|
||||
system_prompt: Some("new prompt".into()),
|
||||
include_all_memories: None,
|
||||
reviewed_only_facts: None,
|
||||
allow_agent_corrections: None,
|
||||
},
|
||||
)
|
||||
.unwrap()
|
||||
@@ -425,6 +437,7 @@ mod tests {
|
||||
system_prompt: None,
|
||||
include_all_memories: Some(true),
|
||||
reviewed_only_facts: None,
|
||||
allow_agent_corrections: None,
|
||||
},
|
||||
)
|
||||
.unwrap()
|
||||
|
||||
Reference in New Issue
Block a user