Architecture Deep-Dive

Tombstones and log compaction: deletion in an append-only store

David Faith 2026-06-227 min read

In an append-only store you never erase a record in place — so a deletion is itself an appended record: a tombstone that marks a prior entry as removed. The tombstone has to propagate to every replica before the space can be reclaimed, otherwise a node that missed it would resurrect the deleted data on the next sync. Reclaiming space is a separate background job — log compaction or garbage collection — that rewrites the log once a tombstone is provably everywhere. For genuinely sensitive data, true erasure is achieved by crypto-shredding: encrypt each record under its own key and delete the key, so the ciphertext that remains in old segments is unreadable. Most forgetting, though, should be reversible decay, not deletion.

A delete is just another append

In an append-only log the one operation is append. You never overwrite or truncate, because the log is the source of truth and it is replicated to many machines. So how do you remove something? You append a record that says it is removed. That record is a tombstone: a small marker that references a prior entry by id and declares it dead.

This is the same move the observed-remove set makes. A remove is not a destructive edit; it is an event with its own causal position, so it merges deterministically with everything else. Two replicas that have seen both the original write and its tombstone agree that the item is gone. A replica that has only seen the write still shows it — correctly, because as far as that replica knows, the delete hasn’t happened yet.

Why tombstones must propagate before you reclaim

Here is the trap. The obvious optimization is: once we’ve tombstoned something, drop both the original and the tombstone and free the space. In a distributed store that resurrects data.

Imagine node A writes a fact, then tombstones it, then immediately garbage-collects both. Node B has been offline and never saw the tombstone — it only has the original write. On the next sync, B re-sends the write to A, and A, having forgotten it was ever deleted, accepts it as new. The deletion is undone. This is the OR-Set remove problem in the field: a remove only wins if every replica observes the remove. You cannot forget the tombstone until you can prove the tombstone reached everyone who held the data.

So the tombstone has to outlive the thing it deletes, and it has to stay around long enough to dominate every concurrent re-add. Only then is reclamation safe. With each machine holding a full local copy synced peer-to-peer, “everyone” means every replica — so reclamation waits on causal coverage, not a wall clock.

Log compaction and garbage collection

Reclaiming space is a separate background process from deleting. Log compaction rewrites the log into a new, shorter log that preserves the current logical state while dropping entries that are provably superseded — overwritten keys, and tombstoned records whose tombstones have propagated past the safe horizon. Garbage collection is the umbrella term for the bookkeeping that decides what is safe to drop: which entries are still reachable, which tombstones are still load-bearing, and which segments can be discarded entirely.

The key property is that compaction is semantically transparent: replaying the compacted log yields the same current state as replaying the full history, minus the parts everyone has agreed to forget. It bounds storage without changing what the store believes today. Retention policy — how long tombstones and old segments linger — is a tuning knob on this process, not a separate mechanism.

Soft-forget versus hard-delete

Not all forgetting is deletion, and conflating the two is the mistake the decay-vs-deletion split exists to prevent.

Defaulting to decay keeps the corpus rich and recoverable; reserving hard-delete for deliberate acts keeps it safe.

Crypto-shredding and the right to be forgotten

There is a real tension between immutability and the right to be forgotten. If a record is encrypted and replicated to offline laptops and archived segments, physically rewriting every copy on demand is slow at best and impossible at worst. Crypto-shredding resolves it: encrypt each record (or each subject’s data) under its own key, store only ciphertext in the log, and keep the keys in a separate, mutable keystore. To erase the data, you delete the key, not the rows. Every copy of the ciphertext — everywhere, including replicas you cannot reach right now — becomes undecryptable noise the moment the key is gone.

This gives you true erasure that honors a GDPR-style request without violating append-only semantics: the log stays immutable, the tombstone preserves the fact of deletion for provenance, and the content is provably unrecoverable. In HiveMind the corpus is append-mostly and forgetting is deliberate — soft decay by default, with deletion a separate, owner-driven act layered on top of these mechanisms rather than a casual overwrite.

Frequently asked

Why can't an append-only store just delete the record?

Because the log is the source of truth and it is replicated. Mutating or removing bytes in place breaks the guarantees that let many replicas converge — a peer that already synced the old segment would still hold the data, and a peer that hadn't would re-send it. So the delete is modeled as a new appended fact (a tombstone) that every replica can merge deterministically, and the bytes are reclaimed later by compaction once the tombstone has propagated everywhere.

What is crypto-shredding and when do you need it?

Crypto-shredding encrypts each record (or each subject's records) under a distinct key and stores only the ciphertext in the log. To erase the data you destroy the key, not the rows: every copy of the ciphertext, including ones in archived segments or on offline replicas, instantly becomes undecryptable noise. It is the practical way to satisfy a right-to-be-forgotten request against an immutable, widely-replicated store, where physically rewriting every copy is slow or impossible.

Isn't a tombstone still a record of the deleted thing?

A tombstone records that something was removed and references it by id — it does not need to contain the payload. That preserves provenance (you can prove a deletion happened and when) without retaining the sensitive content. When the content itself must be unrecoverable, you combine the tombstone with crypto-shredding or a compaction pass that drops the superseded payload.

Related

Take yourself out of the loop.

Let your agents do the lifting while you keep the judgment.

Get the Playbook