Architecture Deep-Dive

Causal consistency and vector clocks: why agents read current, not stale, state

David Faith · 2026-06-22 · 7 min read

An agent reads stale state when it observes a value without observing the updates that caused it — a read that lands before the writes it depends on. Causal consistency forbids exactly that: it guarantees that if one update happened-before another, every replica applies them in that order, so a reader never sees an effect without its cause. Systems track this with logical clocks — Lamport timestamps for a total order, vector clocks when you must also detect true concurrency — and repair lagging replicas by anti-entropy. Reading at a causally-consistent cut is the formal version of "read current reality, not a snapshot": the snapshot is allowed to be old, but it can never be incoherent.

The problem is causality, not freshness

The intuitive fear about an agent is that it reads old data. The precise, dangerous failure is narrower: it reads data that is incoherent — a value without the updates that produced it. Suppose agent A records a correction, and that correction is the reason agent B should change course. If B reads the new course but not the correction that justifies it — or worse, reads the correction’s effect on one field but the pre-correction value on another — it acts on a state that never actually existed on any single replica at any single moment. Staleness is tolerable; that kind of torn read is not.

Distributed systems make this tractable by reasoning about causality directly rather than about clock time, because wall clocks skew and reorder. The foundational tool is Lamport’s happens-before relation.

Happens-before: a partial order over events

Lamport defined happens-before (→) with three rules: if a and b are events on the same node and a comes first, then a → b; if a is the sending of a message and b is its receipt, then a → b; and → is transitive, so a → b and b → c together give a → c. Two events where neither a → b nor b → a holds are concurrent: they have no causal relationship, and any replica is free to order them however it likes.
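The three rules can be sketched directly, assuming a toy event log whose shape (a per-node list of event names plus a list of send/receive pairs) is invented here for illustration. The function and variable names are hypothetical, and the closure is deliberately naive rather than efficient.

```python
from itertools import product

def happens_before(program_order, messages):
    """Return the set of (a, b) pairs such that a -> b.

    program_order: dict of node -> list of event names in local order
    messages: list of (send_event, receive_event) pairs
    (Both input shapes are assumptions made for this sketch.)
    """
    edges = set()
    # Rule 1: on one node, earlier events precede later ones.
    for events in program_order.values():
        for i in range(len(events)):
            for j in range(i + 1, len(events)):
                edges.add((events[i], events[j]))
    # Rule 2: sending a message precedes receiving it.
    edges.update(messages)
    # Rule 3: transitivity, computed as a naive closure.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(edges), repeat=2):
            if b == c and (a, d) not in edges:
                edges.add((a, d))
                changed = True
    return edges

def concurrent(a, b, hb):
    """Two distinct events are concurrent when neither precedes the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb

program_order = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
messages = [("a2", "b1")]  # a2 is a send on A, b1 is its receipt on B
hb = happens_before(program_order, messages)
```

Here a1 → b2 holds only through transitivity (a1 → a2 → b1 → b2), while c1 exchanged no messages with anyone, so `concurrent("c1", "b2", hb)` is true.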

This is a partial order, and that partialness is the whole point. The events that genuinely depend on each other are ordered; the events that don’t are left unordered, and forcing an order on them would mean coordination you can’t afford.

Lamport timestamps vs vector clocks

A Lamport timestamp assigns each event a single integer: increment a local counter on every event, and on receiving a message set the counter to max(local, received) + 1. This guarantees a → b ⟹ L(a) < L(b) and, with ties broken by node ID, gives a total order you can sort by. What it cannot do is detect concurrency: L(a) < L(b) might mean a caused b, or might mean they’re concurrent and a simply happened to get the lower number.
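A minimal sketch of that update rule, assuming three in-process clocks standing in for nodes. The class name `LamportClock` and the node labels are illustrative, not taken from any particular library.

```python
class LamportClock:
    """A single integer counter per node."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: increment and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, received):
        """Message receipt: move past both the local and the received clock."""
        self.time = max(self.time, received) + 1
        return self.time

a, b, c = LamportClock(), LamportClock(), LamportClock()
send_ts = a.tick()            # node a sends a message
recv_ts = b.receive(send_ts)  # node b receives it, so send -> receive
c_ts = c.tick()               # node c does something unrelated

# Sorting by (timestamp, node id) yields one total order for everyone.
total_order = sorted([(send_ts, "a"), (recv_ts, "b"), (c_ts, "c")])
```

The send gets 1 and the receipt gets 2, as causality requires. Node c's event also gets 1, which is lower than the receipt's 2 even though the two events are concurrent: the integer alone cannot tell those situations apart.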

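When you need to tell those apart, and to detect concurrent writes you do, you use a vector clock: a per-node vector of counters. Node i increments its own slot on each event and, on receipt, takes the element-wise max then increments its own slot. Comparison is then exact:

If every slot of V(a) is ≤ the matching slot of V(b) and at least one is strictly smaller, a happened-before b.
If the same holds with a and b swapped, b happened-before a.
If neither holds, a and b are concurrent.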

That last case is the one that matters for shared memory: it’s how a replica recognizes two writes that were made without knowledge of each other, so it can preserve both rather than silently overwrite. This is the same machinery the CRDT deep-dive relies on to merge concurrent writes deterministically.
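The comparison can be sketched as follows, assuming clocks are stored as dictionaries from node ID to counter with missing slots read as zero. `VectorClock`, `Order`, and `compare` are hypothetical names chosen for this example.

```python
from enum import Enum

class Order(Enum):
    BEFORE = "before"
    AFTER = "after"
    EQUAL = "equal"
    CONCURRENT = "concurrent"

def compare(u, v):
    """Compare two vector timestamps given as {node_id: counter} dicts."""
    nodes = set(u) | set(v)
    u_le_v = all(u.get(n, 0) <= v.get(n, 0) for n in nodes)
    v_le_u = all(v.get(n, 0) <= u.get(n, 0) for n in nodes)
    if u_le_v and v_le_u:
        return Order.EQUAL
    if u_le_v:
        return Order.BEFORE
    if v_le_u:
        return Order.AFTER
    return Order.CONCURRENT

class VectorClock:
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = {}

    def tick(self):
        """Local event: bump our own slot and return a copy as the timestamp."""
        self.clock[self.node_id] = self.clock.get(self.node_id, 0) + 1
        return dict(self.clock)

    def receive(self, other):
        """Receipt: element-wise max with the sender's vector, then tick."""
        for node, count in other.items():
            self.clock[node] = max(self.clock.get(node, 0), count)
        return self.tick()

a, b = VectorClock("a"), VectorClock("b")
w1 = a.tick()        # written on a with no knowledge of b
w2 = b.tick()        # written on b with no knowledge of a
w3 = b.receive(w1)   # written on b after it has seen w1
```

`compare(w1, w2)` reports the two writes as concurrent, so a replica knows to keep both, while `compare(w1, w3)` reports that w1 came first.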

Causal consistency as a consistency model

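A causally consistent store guarantees that operations related by happens-before are observed by everyone in that order; concurrent operations may be observed in any order, possibly differently on different replicas. On the consistency spectrum it sits between two neighbours: strong consistency (linearizability), which makes every replica behave like a single copy but needs coordination on the write path, and eventual consistency, which promises only that replicas converge once updates stop and places no constraint on the order in which a reader sees them.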

Causal consistency is the sweet spot for local-first agent memory: it lets every device keep writing while offline, yet forbids a reader from ever seeing an effect before its cause.
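One common way to enforce this ordering is to buffer an incoming update until everything it depends on has been applied. The sketch below assumes each update carries its origin and a dependency vector counting writes per node. The `Replica` class and its fields are hypothetical and are not a description of any specific product's implementation.

```python
class Replica:
    def __init__(self):
        self.delivered = {}  # node -> number of that node's writes applied here
        self.pending = []    # updates received but not yet safe to apply
        self.log = []        # values in the order this replica applied them

    def _ready(self, sender, deps):
        # The update must be the sender's next write in sequence...
        if deps.get(sender, 0) != self.delivered.get(sender, 0) + 1:
            return False
        # ...and every other write it depends on must already be applied.
        return all(count <= self.delivered.get(node, 0)
                   for node, count in deps.items() if node != sender)

    def receive(self, sender, deps, value):
        self.pending.append((sender, deps, value))
        progressed = True
        while progressed:  # applying one update may unblock others
            progressed = False
            for update in list(self.pending):
                s, d, v = update
                if self._ready(s, d):
                    self.log.append(v)
                    self.delivered[s] = d[s]
                    self.pending.remove(update)
                    progressed = True

r = Replica()
# Node B wrote its correction after seeing A's first write,
# but the network delivers the correction first.
r.receive("B", {"A": 1, "B": 1}, "correction")
after_first = list(r.log)  # still empty: the cause has not arrived
r.receive("A", {"A": 1}, "fact")
```

The correction waits in `pending` until the fact it depends on arrives, so the replica's log ends up as fact then correction regardless of arrival order.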

Session guarantees: causality from one client’s view

The global model has a per-client refinement. Session guarantees carry a client’s causal context with each request so its own experience stays coherent. Read-your-writes means an agent that just wrote a fact can always read it back. Monotonic reads means once an agent has seen a value, no later read returns something older — time only moves forward for that session. The implementation is the same vector context: a read is served only by a replica at least as advanced as everything the session has already observed.
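A minimal sketch of that rule, assuming each replica exposes a version vector next to its data and the session remembers the element-wise max of everything it has read or written. The dictionary layout and the names `Session` and `dominates` are assumptions made for illustration.

```python
def dominates(replica_vv, session_vv):
    """True if the replica has applied everything the session has observed."""
    return all(replica_vv.get(node, 0) >= count
               for node, count in session_vv.items())

class Session:
    def __init__(self):
        self.context = {}  # causal context carried with every request

    def _merge(self, vv):
        for node, count in vv.items():
            self.context[node] = max(self.context.get(node, 0), count)

    def wrote(self, version):
        """Record a write so later reads must reflect it (read-your-writes)."""
        self._merge(version)

    def read(self, replicas, key):
        """Serve from the first replica that is at least as advanced as the
        session, then remember what was seen (monotonic reads)."""
        for replica in replicas:
            if dominates(replica["version"], self.context):
                self._merge(replica["version"])
                return replica["data"].get(key)
        raise LookupError("no replica has caught up with this session yet")

stale = {"version": {"A": 1}, "data": {"course": "old"}}
fresh = {"version": {"A": 2}, "data": {"course": "new"}}

session = Session()
session.wrote({"A": 2})                          # the session's own write
value = session.read([stale, fresh], "course")   # stale replica is skipped

newcomer = Session()
first_seen = newcomer.read([stale, fresh], "course")  # old but coherent
```

The session that wrote version 2 is never served by the lagging replica, while a brand-new session may legitimately read the older value because it has observed nothing that contradicts it.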

Anti-entropy: how laggards catch up

Tracking causality tells you what order to apply updates; anti-entropy is the background repair that actually propagates them. Replicas periodically compare summaries of what they hold (version vectors are a natural fit) and exchange the missing updates until they match. It’s the gossip-style mechanism that turns “eventually delivered” into “delivered” without anyone blocking on it — see the peer-to-peer sync deep-dive for how this runs across devices.
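One round of that exchange can be sketched as below, assuming updates are identified by (origin, sequence number) and that each replica holds a gap-free prefix of every origin's updates, which is what lets a version vector summarize its contents. All names are hypothetical, and real systems add batching, retries, and compaction that are omitted here.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.updates = {}  # (origin, seq) -> value

    def version_vector(self):
        """Summary of holdings: highest sequence number seen per origin."""
        vv = {}
        for origin, seq in self.updates:
            vv[origin] = max(vv.get(origin, 0), seq)
        return vv

    def write(self, value):
        seq = self.version_vector().get(self.name, 0) + 1
        self.updates[(self.name, seq)] = value

    def missing_for(self, peer_vv):
        """Updates we hold that the peer's summary says it lacks."""
        return {key: value for key, value in self.updates.items()
                if key[1] > peer_vv.get(key[0], 0)}

    def merge(self, updates):
        self.updates.update(updates)

def anti_entropy(x, y):
    """One pairwise round: swap summaries, then send each side its gaps."""
    x_vv, y_vv = x.version_vector(), y.version_vector()
    to_y = x.missing_for(y_vv)
    to_x = y.missing_for(x_vv)
    y.merge(to_y)
    x.merge(to_x)

laptop, phone = Replica("laptop"), Replica("phone")
laptop.write("fact")
laptop.write("correction")
phone.write("note")      # written while the two were out of contact
anti_entropy(laptop, phone)
```

After the round both replicas hold the same three updates and report the same version vector, and neither had to block its own writes while waiting for the other.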

How HiveMind applies it

HiveMind keeps a full copy of the shared corpus on each of your machines and syncs them peer-to-peer, with anti-entropy reconciling replicas in the background; your data never leaves your devices. Because the record is append-mostly — corrections are new facts, not destructive edits — causal order is what lets a correction reliably supersede the fact it corrects: the correction happened-after the original, every replica respects that, and no agent reads the old value once it has seen the new one. That is the formal version of the stale-knowledge cure: an agent reads at a causally-consistent cut, so its picture may lag the very latest write but can never be internally incoherent.

Causal order is the floor, not the ceiling. It guarantees agents agree on what came after what; deciding which of two concurrent claims to trust is a separate layer, built on corroboration and earned confidence — no agent certifies its own truth.

Frequently asked

What's the difference between a Lamport timestamp and a vector clock?

A Lamport timestamp is a single integer per event that gives you a total order consistent with happens-before: if a happened-before b, then L(a) < L(b). But the converse fails — L(a) < L(b) does not tell you whether a caused b or the two are concurrent. A vector clock keeps one counter per node, so comparing two vectors distinguishes three cases precisely: one happened-before the other, the reverse, or neither (concurrent). You pay O(N) space per timestamp for the ability to detect concurrency.

Is causal consistency the same as strong consistency?

No. Strong consistency (linearizability) makes the system behave as if there were one copy and every operation took effect at a single instant, which requires coordination on the write path and so costs availability under partition. Causal consistency only orders operations that are actually causally related; concurrent operations may be seen in different orders on different replicas. That weaker promise is achievable while staying available and partition-tolerant, and causal consistency is generally regarded as about the strongest model for which that is true, which is why it suits an always-writable, local-first system.

What do session guarantees add on top of causal consistency?

Session guarantees pin the model to a single client's experience over time. Read-your-writes ensures you can always read an update you just made; monotonic reads ensures that once you've seen a value, you never later see an older one; monotonic writes and writes-follow-reads order your own operations sensibly. They're implemented by carrying the session's causal context (a vector or set of versions) with each request, so a read is only served by a replica that is at least as current as everything the session has already observed.
