The Coherence Score: How Invariant Measures Agent World State
Under the hood of the Φ formula — turning a graph of claims, constraints, and dependencies into a number that tells you whether your agent's worldview can be trusted.
Every time your agent proposes an action, Invariant computes a risk score and decides whether to allow it. That decision comes from the coherence score — a single number, Φ, that represents the structural integrity of the agent's current world model.
This post explains exactly how it's computed, what each dimension means, and how to tune it for your use case.
The Formula
The coherence score is a weighted sum over five dimensions of world state quality:
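Written out from the definitions in this post, the sum presumably takes the form below. This is a reconstruction, not the published expression. The API reports Φ between 0 and 1 while the default weights sum to 4.5, so some normalization is likely applied; dividing by the total weight is one plausible option, not a confirmed one.

```latex
\Phi = \lambda_c V_c + \lambda_k V_k + \lambda_d V_d + \lambda_u V_u + \lambda_b V_b
\qquad \text{(optionally normalized by } \textstyle\sum_i \lambda_i \text{)}
```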
Each V term measures a specific type of structural health. Each λ is a configurable weight. Higher Φ means a more coherent, trustworthy world state. Lower Φ means the agent is operating on shakier ground.
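A minimal sketch of the weighted sum, under two assumptions that this post does not confirm: each V term is a health value in [0, 1], and the sum is divided by the total weight so that Φ also lands in [0, 1]. The function name `coherence` and the short keys are hypothetical; the weights are the defaults listed in the tuning section.

```python
# Default weights as listed in this post's configuration section.
DEFAULT_WEIGHTS = {"c": 1.0, "k": 1.5, "d": 0.8, "u": 0.5, "b": 0.7}


def coherence(v, weights=DEFAULT_WEIGHTS):
    """Weighted mean of the five V terms (the normalization is an assumption)."""
    for name in weights:
        if not 0.0 <= v[name] <= 1.0:
            raise ValueError(f"V_{name} must lie in [0, 1], got {v[name]}")
    total_weight = sum(weights.values())
    return sum(weights[name] * v[name] for name in weights) / total_weight


# Hypothetical V values: everything healthy except half-stale claims.
example = {"c": 1.0, "k": 0.5, "d": 1.0, "u": 1.0, "b": 1.0}
phi = coherence(example)
```

With these illustrative inputs, `phi` comes out at roughly 0.83: because staleness carries the largest default weight, a half-stale set of claims alone pulls the score close to the 0.8 line.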
The Five Dimensions
| Dimension | Variable | What it measures |
| --- | --- | --- |
| Constraint violations | Vc | How many active constraints are currently violated |
| Claim staleness | Vk | How stale the agent's beliefs are, weighted by importance |
| Dependency integrity | Vd | Whether entity relationships are still structurally valid |
| Uncertainty exposure | Vu | How many unresolved contradictions the agent is sitting on |
| Branch divergence | Vb | How far the current branch has diverged from the canonical state |
Vc — Constraint Violations
Constraints are rules that must hold across entities at all times — for example, "no two deployments can be active for the same service simultaneously" or "a file cannot be both locked and modified." Vc decreases as active constraints are violated. A single critical constraint violation can drop the score significantly.
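As an illustration, the first example constraint can be checked with a few lines of Python. Everything here is a hypothetical sketch: the record fields, the function names, and the severity-weighted formula for Vc are assumptions chosen to show how one critical violation could dominate the score, not Invariant's actual implementation.

```python
from collections import Counter


def services_with_conflicting_deployments(deployments):
    """Services that violate 'no two active deployments for the same service'."""
    active = Counter(d["service"] for d in deployments if d["active"])
    return sorted(service for service, count in active.items() if count > 1)


def constraint_health(constraints):
    """Assumed form of Vc: 1 minus the severity-weighted share of violated constraints."""
    total = sum(c["severity"] for c in constraints)
    if total == 0:
        return 1.0
    violated = sum(c["severity"] for c in constraints if c["violated"])
    return 1.0 - violated / total


deployments = [
    {"service": "api", "active": True},
    {"service": "api", "active": True},
    {"service": "worker", "active": True},
    {"service": "worker", "active": False},
]
conflicts = services_with_conflicting_deployments(deployments)
```

Here `conflicts` contains only `"api"`, since `worker` has a single active deployment.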
Vk — Claim Staleness
Every claim in the world state has a timestamp and a decay rate. Staleness is computed as:
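One form consistent with the symbols defined here is exponential decay toward 1. The exact expression is an assumption, and t (the current time) is a symbol introduced for this reconstruction:

```latex
s_k(t) = 1 - e^{-\lambda_s \,(t - \tau_k)}
```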
Where τ_k is the time the claim was last verified and λ_s is the decay constant (configurable per claim type). A freshly-verified claim has near-zero staleness. A claim that hasn't been touched in an hour has high staleness and drags down Vk.
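A small sketch of this behavior, assuming staleness follows exponential decay (1 minus e raised to the negative decay constant times the claim's age) and that Vk is one minus the importance-weighted mean staleness. Both formulas, the field names, and the function names are hypothetical.

```python
import math


def staleness(now, last_verified, decay):
    """Staleness in [0, 1): 0 when just verified, approaching 1 as the claim ages."""
    age = max(0.0, now - last_verified)
    return 1.0 - math.exp(-decay * age)


def claim_freshness(claims, now):
    """Assumed form of Vk: 1 minus the importance-weighted mean staleness."""
    total = sum(c["importance"] for c in claims)
    if total == 0:
        return 1.0
    weighted = sum(
        c["importance"] * staleness(now, c["verified_at"], c["decay"])
        for c in claims
    )
    return 1.0 - weighted / total


# With a decay constant of 1/1200 per second, a claim untouched for an hour
# (3600 s) has staleness 1 - e^-3, i.e. about 0.95.
hour_old = staleness(now=3600.0, last_verified=0.0, decay=1.0 / 1200.0)
```

The decay constant sets the time scale: a claim type that changes slowly can take a smaller constant so that an hour of silence does not count against it.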
Vd — Dependency Integrity
Entities in the world state are connected by typed dependency edges: SUPPORTS, REQUIRES, INVALIDATES, PRODUCES, DEPENDS_ON. When an entity is modified or deleted, Invariant traverses the dependency graph and flags edges that are now structurally broken. Vd reflects how many of these broken edges exist.
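The idea can be sketched as follows, assuming an edge counts as broken when either endpoint no longer exists and that Vd is the fraction of edges still intact. The `Edge` type, both function names, and that definition of "broken" are assumptions for illustration; only the five edge type names come from this post.

```python
from dataclasses import dataclass

EDGE_TYPES = {"SUPPORTS", "REQUIRES", "INVALIDATES", "PRODUCES", "DEPENDS_ON"}


@dataclass(frozen=True)
class Edge:
    source: str
    kind: str
    target: str

    def __post_init__(self):
        if self.kind not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {self.kind}")


def broken_edges(edges, live_entities):
    """Edges with at least one endpoint that no longer exists."""
    return [
        e for e in edges
        if e.source not in live_entities or e.target not in live_entities
    ]


def dependency_integrity(edges, live_entities):
    """Assumed form of Vd: fraction of edges still intact (1.0 for an empty graph)."""
    if not edges:
        return 1.0
    return 1.0 - len(broken_edges(edges, live_entities)) / len(edges)


edges = [
    Edge("deploy", "REQUIRES", "image"),
    Edge("image", "DEPENDS_ON", "repo"),
    Edge("test", "SUPPORTS", "deploy"),
    Edge("deploy", "PRODUCES", "log"),
]
# The "repo" entity has been deleted, so one of the four edges is broken.
live = {"deploy", "image", "test", "log"}
vd = dependency_integrity(edges, live)
```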
Vu — Uncertainty Exposure
When two claims about the same entity contradict each other, Invariant opens a contradiction record rather than silently preferring one. Vu measures the agent's exposure to unresolved contradictions — how many exist and how central the affected entities are to the current action.
Vb — Branch Divergence
When a contradiction can't be resolved, the world state forks into branches representing alternative coherent realities. Vb measures how far the current working branch has diverged from the canonical trunk — a proxy for how speculative the agent's current assumptions are.
From Φ to Action Risk
Φ is a world-state score. When an agent proposes an action, Invariant computes an action risk score that combines the current Φ with the specific impact of the proposed action:
Where each r term captures a specific risk dimension of the action itself:
- rc — constraint violation risk (does this action break any active rule?)
- rd — dependency breakage risk (does this action sever a required edge?)
- ru — contradiction amplification (does this action make existing contradictions worse?)
- re — uncertainty exposure (is the agent acting on a claim it isn't sure about?)
- rp — provenance fragility (is the claim this action depends on from a low-trust source?)
- rg — propagated risk (graph traversal — how many downstream entities does this touch?)
If R(a) > ε (your configured per-action risk threshold, ACTION_EPSILON), the action is blocked. If R(a) ≤ ε, it proceeds and the world state is updated.
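The gating step can be sketched as below. The way Φ and the six r terms are combined here, R(a) = 1 - Φ · (1 - mean of the r terms), is purely an illustrative assumption chosen because it stays in [0, 1] and rises as coherence falls; it is not Invariant's formula. The `ActionRisk` type and function names are likewise hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass
class ActionRisk:
    rc: float = 0.0  # constraint violation risk
    rd: float = 0.0  # dependency breakage risk
    ru: float = 0.0  # contradiction amplification
    re: float = 0.0  # uncertainty exposure
    rp: float = 0.0  # provenance fragility
    rg: float = 0.0  # propagated risk


def action_risk(phi, r):
    """Illustrative combination only: low coherence or risky terms push R(a) up."""
    terms = astuple(r)
    mean_risk = sum(terms) / len(terms)
    return 1.0 - phi * (1.0 - mean_risk)


def decide(phi, r, epsilon=0.6):
    """Block when R(a) exceeds epsilon, otherwise allow."""
    return "blocked" if action_risk(phi, r) > epsilon else "allowed"


low_risk = ActionRisk(0.1, 0.1, 0.1, 0.1, 0.1, 0.1)
high_risk = ActionRisk(0.4, 0.4, 0.4, 0.4, 0.4, 0.4)
```

Under this assumed combination, `decide(0.9, low_risk)` allows the action (R(a) is about 0.19), while `decide(0.5, high_risk)` blocks it (R(a) is about 0.7): the same action becomes riskier as the world state degrades.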
Default Weights and Tuning
Out of the box, Invariant ships with the following defaults, intended as a starting point for most agent use cases:
# Coherence score weights (Phi)
COHERENCE_LAMBDA_C=1.0 # constraint violations
COHERENCE_LAMBDA_K=1.5 # staleness (highest weight: stale beliefs are dangerous)
COHERENCE_LAMBDA_D=0.8 # dependency integrity
COHERENCE_LAMBDA_U=0.5 # uncertainty exposure
COHERENCE_LAMBDA_B=0.7 # branch divergence
# Action risk threshold
ACTION_EPSILON=0.6 # block if R(a) > 0.6
ACTION_BUDGET=5.0 # total cumulative risk budget per session
For high-stakes agents (infra, finance, healthcare), lower ACTION_EPSILON to 0.3–0.4 and increase COHERENCE_LAMBDA_C. For exploratory agents that need more latitude, raise ACTION_EPSILON and lower COHERENCE_LAMBDA_K.
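A sketch of how these settings might be loaded and tuned from application code, assuming they are supplied as environment variables (the names above suggest this, but it is not stated). `load_settings` and `high_stakes` are hypothetical helpers, and the raised `COHERENCE_LAMBDA_C` value of 2.0 is an arbitrary example.

```python
import os

DEFAULTS = {
    "COHERENCE_LAMBDA_C": 1.0,
    "COHERENCE_LAMBDA_K": 1.5,
    "COHERENCE_LAMBDA_D": 0.8,
    "COHERENCE_LAMBDA_U": 0.5,
    "COHERENCE_LAMBDA_B": 0.7,
    "ACTION_EPSILON": 0.6,
    "ACTION_BUDGET": 5.0,
}


def load_settings(env=None):
    """Read each setting from the environment, falling back to the defaults."""
    source = os.environ if env is None else env
    return {name: float(source.get(name, default)) for name, default in DEFAULTS.items()}


def high_stakes(settings, epsilon=0.35, lambda_c=2.0):
    """Stricter profile: lower per-action threshold, heavier constraint weight."""
    tuned = dict(settings)
    tuned["ACTION_EPSILON"] = epsilon
    tuned["COHERENCE_LAMBDA_C"] = lambda_c
    return tuned


settings = high_stakes(load_settings({}))
```

Passing an explicit mapping, as in `load_settings({})`, keeps the helper easy to test; calling it with no argument reads from the process environment.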
Reading the Score in Practice
The coherence score is returned with every action validation response:
{
  "status": "allowed",
  "riskScore": 0.23,
  "coherenceScore": 0.81,
  "worldState": {
    "activeConstraints": 3,
    "violations": 0,
    "contradictions": 1,
    "staleClaims": 2,
    "branchDepth": 0
  }
}
A coherence score above 0.8 is healthy. Between 0.6 and 0.8, the agent is operating on somewhat stale or uncertain state, which is worth monitoring. Below 0.6, consider pausing execution and re-grounding the agent's world state before proceeding.
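These bands are straightforward to apply on the client side. The sketch below parses a validation response shaped like the one above and maps the coherence score to a band; the function names and band labels are hypothetical, and a score of exactly 0.8 is treated as the middle band.

```python
import json


def health_band(phi):
    """Map a coherence score to the bands described in this post."""
    if phi > 0.8:
        return "healthy"
    if phi >= 0.6:
        return "monitor"
    return "re-ground"


def summarize(response_text):
    """Pull out the fields most useful for logging and alerting."""
    body = json.loads(response_text)
    world = body["worldState"]
    return {
        "status": body["status"],
        "band": health_band(body["coherenceScore"]),
        "contradictions": world["contradictions"],
        "staleClaims": world["staleClaims"],
    }


response = """
{
  "status": "allowed",
  "riskScore": 0.23,
  "coherenceScore": 0.81,
  "worldState": {"activeConstraints": 3, "violations": 0,
                 "contradictions": 1, "staleClaims": 2, "branchDepth": 0}
}
"""
summary = summarize(response)
```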
Why a Score, Not Just Blocks?
Hard blocks are necessary but not sufficient. An agent running at coherence 0.62 with no blocked actions is still an agent operating on thin ice — the blocks just haven't triggered yet.
The coherence score gives you a continuous signal you can log, alert on, and feed back to the agent as context. Some teams inject the current Φ directly into the agent's system prompt: "Your current world-state coherence is 0.71. Proceed carefully and re-verify stale claims before modifying shared resources."
That feedback loop — model aware of its own epistemic state — is where the real reliability gains come from.
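One way to wire up that loop is a small helper that appends the notice to a base system prompt. This is a minimal sketch: the function names are hypothetical, and the wording is the example quoted above.

```python
def coherence_notice(phi):
    """Format the current coherence score as a sentence for the system prompt."""
    return (
        f"Your current world-state coherence is {phi:.2f}. "
        "Proceed carefully and re-verify stale claims before modifying shared resources."
    )


def build_system_prompt(base_prompt, phi):
    """Append the coherence notice to an existing system prompt."""
    return f"{base_prompt.rstrip()}\n\n{coherence_notice(phi)}"


prompt = build_system_prompt("You are a deployment assistant.", 0.71)
```

Regenerating the prompt on each turn keeps the notice in step with the latest score returned by validation.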
What's Next
We're working on per-entity coherence scores, so you can identify specifically which part of the world state is degraded rather than just the aggregate. We're also building a visual coherence timeline in the dashboard — watching Φ drop in real time as an agent run progresses is surprisingly useful for debugging.
If you want to dig into the math further, see the full API docs or reach out directly.