Architecture Deep-Dive

The peer-to-peer sync protocol: gossip, anti-entropy, and delta exchange

David Faith • 2026-06-22 • 7 min read

A peer-to-peer sync protocol lets replicas with no central server converge by exchanging only what each one is missing. Peers gossip updates epidemically — each node periodically picks a peer and trades news — while a background anti-entropy pass reconciles any divergence the gossip missed. To avoid shipping whole copies, peers compare a Merkle index to locate the exact ranges that differ, then send only those deltas. HiveMind uses this to keep a full local copy on every device in sync over a private mesh, so the shared state is visible everywhere without anything leaving your machines.

The problem: convergence with nobody in charge

When every device holds a full copy of the shared memory and any of them can accept writes, you need a way for those copies to agree without a central server to funnel through. The naive approach, in which each peer periodically sends its whole corpus to every other peer, converges, but every transfer costs O(state) bandwidth, repeated for every pair of peers in every round, and it gets worse the more history you accumulate. The real protocol design problem is: how do two replicas converge while transmitting only what one of them is missing?

The answer comes in two cooperating layers. Gossip spreads new writes quickly and cheaply. Anti-entropy runs underneath as a completeness guarantee, periodically proving two peers hold the same data and repairing them when they don’t. Neither needs a coordinator, and both tolerate the network being unreliable.

Gossip: epidemic dissemination

Gossip protocols borrow their math from epidemiology. When a node has news — a new record — it doesn’t broadcast to everyone. It picks a small random set of peers each round and tells them; they tell their peers; and so on. An update reaches the whole network in a number of rounds that grows only logarithmically with the number of peers, and it does so with no single node bearing the fan-out cost. This is epidemic dissemination, and it is robust precisely because it is redundant: there is no critical path, so losing any node just means the news takes another route.
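
The logarithmic growth is easy to check with a toy simulation. The sketch below is an illustration only, not HiveMind's implementation: it assumes synchronous rounds, push-only gossip, a fanout of one, and peers chosen uniformly at random, and the function name `rounds_to_full_spread` is made up for this example.

```python
import math
import random

def rounds_to_full_spread(n_peers: int, fanout: int = 1, seed: int = 0) -> int:
    """Simulate push gossip; return the rounds until every peer is informed."""
    rng = random.Random(seed)      # seeded so a run is repeatable
    informed = {0}                 # peer 0 starts with the new record
    rounds = 0
    while len(informed) < n_peers:
        newly_told = set()
        for _ in informed:
            # Each informed peer tells `fanout` peers picked uniformly at
            # random. A pick may land on an already-informed peer (or on the
            # sender itself), which is the redundancy gossip pays for.
            for _ in range(fanout):
                newly_told.add(rng.randrange(n_peers))
        informed |= newly_told
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (16, 256, 4096):
        r = rounds_to_full_spread(n)
        print(f"{n:5d} peers: {r} rounds (log2(n) = {math.log2(n):.0f})")
```

Going from 16 to 4096 peers multiplies the network size by 256, but the round count should grow only by a small additive amount. The exact numbers depend on the seed and on the assumptions above.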

Gossip comes in three modes, distinguished by who initiates and what flows:

Push: a node with new data offers it to a peer. Fast at spreading a fresh write, but wasteful late in dissemination, when most peers already have it.
Pull: a node asks a peer “what’s new?” Efficient at finishing dissemination, when few peers are still missing the update.
Push-pull: both directions in one exchange. This is the usual default because it combines the early speed of push with the tail efficiency of pull.
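
The three modes can be sketched with plain sets standing in for each peer's records. This is a simplified model under stated assumptions: the function names are hypothetical, and a real exchange would trade summaries or digests first instead of comparing whole sets directly.

```python
def push(sender: set, receiver: set) -> int:
    """Sender offers its records; receiver keeps the ones it lacks."""
    missing = sender - receiver
    receiver |= missing            # mutates the receiver's set in place
    return len(missing)            # number of records that actually moved

def pull(requester: set, responder: set) -> int:
    """Requester asks "what's new?" and takes whatever it lacks."""
    missing = responder - requester
    requester |= missing
    return len(missing)

def push_pull(a: set, b: set) -> int:
    """Both directions in one exchange; afterwards a and b are equal."""
    return push(a, b) + pull(a, b)

if __name__ == "__main__":
    laptop = {"r1", "r2"}
    phone = {"r2", "r3"}
    moved = push_pull(laptop, phone)
    print(moved, sorted(laptop), sorted(phone))
```

After a push-pull both peers hold the union of their records, which is why a single exchange is enough to synchronize a pair.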

Anti-entropy: proving and repairing divergence

Gossip is best-effort. Messages drop, nodes go offline, partitions form, so some replicas will silently fall behind. Anti-entropy is the reconciliation pass that closes the gap: two peers compare summaries of their entire state and exchange whatever differs. As long as the pass keeps running and peers can eventually reach one another, it guarantees eventual consistency no matter what the gossip layer lost.

The expensive way to do anti-entropy is to compare full state. The cheap way is to compare a Merkle index. Each peer maintains a tree where leaves hash individual records (or small ranges) and every parent hashes its children. Two peers reconcile by comparing root hashes: if the roots match, the datasets are identical (barring a hash collision) and nothing needs to be sent. If they differ, the peers descend, comparing child hashes and pruning every subtree whose hash already agrees. They only ever transmit the leaves under the branches that actually diverge. The work therefore scales with the size of the difference rather than the size of the corpus: for d divergent leaves in a tree of n leaves, the descent compares on the order of d log n hashes and ships only the missing records. HiveMind uses exactly this, a Merkle index to find differences efficiently, so a sync moves deltas, not the whole corpus. (The same tree structure underpins tamper-evident history; see the Merkle DAG provenance deep-dive.)
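
A minimal sketch of that descent, assuming a fixed number of leaf buckets, SHA-256, and records placed into buckets by hash. None of these choices come from HiveMind; `BUCKETS`, `bucket_of`, `build_tree`, `differing_buckets`, and `reconcile` are hypothetical names for illustration.

```python
import hashlib

BUCKETS = 8  # assumed leaf count for the sketch; must be a power of two

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def bucket_of(record: str) -> int:
    """Assign a record to a leaf bucket by hashing it (assumed placement rule)."""
    return int.from_bytes(_h(record.encode())[:4], "big") % BUCKETS

def build_tree(records: set) -> list:
    """Build a heap-layout hash tree: node i has children 2i+1 and 2i+2."""
    leaves = [[] for _ in range(BUCKETS)]
    for rec in records:
        leaves[bucket_of(rec)].append(rec)
    first_leaf = BUCKETS - 1
    tree = [b""] * (2 * BUCKETS - 1)
    for i, bucket in enumerate(leaves):
        # Sort so the leaf hash does not depend on insertion order.
        tree[first_leaf + i] = _h("\n".join(sorted(bucket)).encode())
    for i in range(first_leaf - 1, -1, -1):
        tree[i] = _h(tree[2 * i + 1] + tree[2 * i + 2])
    return tree

def differing_buckets(mine: list, theirs: list, node: int = 0) -> list:
    """Walk both trees top-down and return the leaf buckets that differ."""
    if mine[node] == theirs[node]:
        return []                        # prune: this whole subtree agrees
    first_leaf = BUCKETS - 1
    if node >= first_leaf:
        return [node - first_leaf]       # reached a divergent leaf
    return (differing_buckets(mine, theirs, 2 * node + 1)
            + differing_buckets(mine, theirs, 2 * node + 2))

def reconcile(a: set, b: set) -> list:
    """Exchange only the records in divergent buckets; return those buckets."""
    diff = differing_buckets(build_tree(a), build_tree(b))
    wanted = set(diff)
    from_a = {r for r in a if bucket_of(r) in wanted}
    from_b = {r for r in b if bucket_of(r) in wanted}
    a |= from_b                          # union drops records already held
    b |= from_a
    return diff

if __name__ == "__main__":
    a = {f"rec-{i}" for i in range(50)}
    b = set(a) | {"rec-new"}
    print("divergent buckets:", reconcile(a, b), "converged:", a == b)
```

In this sketch each side ships the full contents of a divergent bucket, so smaller buckets mean tighter deltas at the cost of a taller tree. With one record per leaf, only the missing records themselves would move.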

The payload: delta-state CRDTs

Finding the diff is half the job; the other half is making the exchange safe to apply in any order. The records on the wire are delta-state CRDTs — small fragments of conflict-free replicated state. Because CRDT merge is commutative, associative, and idempotent, a peer can apply incoming deltas regardless of arrival order, duplication, or how many partitions they crossed to get there. That is what lets gossip be lossy and anti-entropy be retried freely: re-delivering a delta is harmless. HiveMind’s record is append-mostly, which keeps the merge a clean monotone union and makes the Merkle ranges stable. The full treatment is in the CRDTs for agent state deep-dive.
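
The three merge properties can be checked directly on the simplest CRDT that fits an append-only record: a grow-only set whose merge is set union. This is an assumed stand-in, since the exact CRDT types HiveMind uses are not described here, and `merge` and `apply_all` are illustrative names.

```python
from itertools import permutations

def merge(state: frozenset, delta: frozenset) -> frozenset:
    """Join for a grow-only set: union is commutative, associative, idempotent."""
    return state | delta

def apply_all(state: frozenset, deltas) -> frozenset:
    """Fold a sequence of deltas into a state, in the order given."""
    for delta in deltas:
        state = merge(state, delta)
    return state

if __name__ == "__main__":
    deltas = [
        frozenset({"r1"}),
        frozenset({"r2", "r3"}),
        frozenset({"r3", "r4"}),       # overlaps the previous delta on purpose
    ]
    # Apply the deltas in every possible arrival order.
    outcomes = {apply_all(frozenset(), order) for order in permutations(deltas)}
    # Deliver every delta twice, as a retried anti-entropy pass might.
    replayed = apply_all(frozenset(), deltas + deltas)
    print(len(outcomes), sorted(replayed))
```

Every arrival order collapses to a single outcome, and replaying the deltas changes nothing, which is the property that makes retries and duplicate deliveries safe.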

Churn, partitions, and the transport

Devices join, leave, and reconnect constantly — that’s churn — and the protocol is built to not care. No write waits on any specific peer, so a partition just means each side keeps appending and gossiping locally; when the link returns, one anti-entropy round compares Merkle roots and heals the split in both directions. The peers reach each other over a private encrypted mesh rather than a public relay (see the WireGuard mesh and NAT traversal deep-dive), which is what keeps P2P replication working across home networks without your data ever transiting a server.
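
A partition and its repair can be modelled in a few lines. The sketch below is illustrative only: `root_digest` is a single hash over the sorted records, standing in for a real Merkle root, and the device and record names are invented.

```python
import hashlib

def root_digest(records: set) -> str:
    """Stand-in for a Merkle root: one hash over the sorted record set."""
    return hashlib.sha256("\n".join(sorted(records)).encode()).hexdigest()

def anti_entropy(a: set, b: set) -> bool:
    """Compare roots and repair in both directions; return True if repair ran."""
    if root_digest(a) == root_digest(b):
        return False                   # roots match: nothing to send
    merged = a | b
    a |= merged                        # a receives what only b had
    b |= merged                        # b receives what only a had
    return True

if __name__ == "__main__":
    laptop = {"r1", "r2"}
    phone = {"r1", "r2"}
    # The link drops; each side keeps accepting writes on its own.
    laptop.add("r3-written-on-laptop")
    phone.add("r3-written-on-phone")
    # The link returns; one round heals the split.
    print("repaired:", anti_entropy(laptop, phone))
    print("roots match:", root_digest(laptop) == root_digest(phone))
    print("second round repaired:", anti_entropy(laptop, phone))
```

The second round finds matching roots and sends nothing, which is what makes it cheap to run the pass on a schedule whether or not a partition occurred.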

Why sync is what makes delegation observable

The plain-English twin of this piece is simple: you can’t delegate to what you can’t see. Sync is the mechanism that makes shared state visible everywhere. Because anti-entropy proves what each replica holds, you can ask any device what it knows and trust the answer — convergence isn’t hoped for, it’s demonstrated by a matching Merkle root. An agent acting on one machine writes into a memory that provably reaches every other machine, and you can look in from any of them. That is the difference between handing work to a black box and delegating to something you can observe.

Frequently asked

What is the difference between gossip and anti-entropy?

Gossip (epidemic dissemination) is the fast path: when a node learns something new it pushes that update to a few random peers, who push it onward, so news spreads exponentially. Anti-entropy is the slow, complete path: two peers periodically compare their full state summaries and reconcile any differences, catching anything a gossip message dropped. Most real systems run both — gossip for low latency, anti-entropy as the safety net that guarantees eventual consistency even after partitions and lost messages.

How do peers find the diff without sending everything?

They compare a Merkle tree of their data. Because a parent hash is computed from its children, two subtrees with matching root hashes hold identical data, so two peers can walk their trees top-down and prune any subtree whose hash already matches; that subtree never gets transmitted. They descend only into the ranges that differ, so the cost of finding the diff grows with the number of divergent leaves and the depth of the tree rather than with the size of the corpus, and only the missing records are shipped.

What happens during a network partition?

Nothing breaks, because no peer depends on any other to accept writes. Each side keeps appending locally, gossiping among whatever peers it can still reach. When the partition heals, the next anti-entropy pass compares Merkle roots, discovers the divergence, and exchanges the deltas in both directions. Because the payloads are CRDTs, the merge is order-independent, so both sides reach the identical state regardless of who reconnects first.
