Maintenance
optimize and cleanup. Fragment compaction and version GC.
Graphs accumulate two kinds of overhead over time: small Lance fragments from frequent writes, and old manifest versions that pin obsolete data. Two maintenance commands address them, one each:
- omnigraph optimize. Non-destructive fragment compaction.
- omnigraph cleanup. Destructive removal of old Lance versions.
When to run
| Symptom | Command |
|---|---|
| Read latency rising on a write-heavy graph | optimize |
| Storage size growing faster than logical data | cleanup after optimize |
| Snapshot-id reads slower than expected | optimize (consolidates the per-snapshot table reads) |
| Backup window exceeds budget | cleanup --keep N --older-than D --confirm |
A common cadence is nightly optimize plus weekly cleanup --keep 10 --older-than 7d --confirm.
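The cadence above can be sketched as a small helper that decides which commands a scheduler should run on a given day. This is only an illustration: the function names, the choice of Sunday for the weekly run, and the `extra_args` parameter are assumptions. The source does not show how the graph location is passed to either command, so that part is left to `extra_args` rather than guessed.

```python
from typing import List, Sequence


def optimize_argv(extra_args: Sequence[str] = ()) -> List[str]:
    # Non-destructive compaction; no flags are documented for it here.
    return ["omnigraph", "optimize", *extra_args]


def cleanup_argv(
    keep: int = 10,
    older_than: str = "7d",
    extra_args: Sequence[str] = (),
) -> List[str]:
    # Only the flags that appear in the text are used.
    return [
        "omnigraph", "cleanup",
        "--keep", str(keep),
        "--older-than", older_than,
        "--confirm",
        *extra_args,
    ]


def commands_for(weekday: int, cleanup_weekday: int = 6) -> List[List[str]]:
    """Return the commands to run today, in order.

    weekday uses Python's convention: 0 is Monday, 6 is Sunday.
    Optimize runs every night; cleanup follows it once a week.
    """
    commands = [optimize_argv()]
    if weekday == cleanup_weekday:
        commands.append(cleanup_argv())
    return commands
```

A scheduler would pass each returned list to something like `subprocess.run(cmd, check=True)`, stopping on the first failure so that cleanup never runs after a failed optimize.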
How they relate
optimize rewrites small fragments into fewer large ones and commits a new
manifest, but the old fragments remain reachable via earlier manifests.
That preserves the ability to read historical snapshots, but it also
means optimize alone doesn't free disk space.
cleanup removes manifests (and the fragments unique to those manifests)
older than the retention policy. After cleanup runs, the snapshots backed
by those removed manifests are no longer readable.
Run them as a pair: optimize first to consolidate, cleanup after to
reclaim.
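A toy model may help show why the two commands only reclaim space as a pair. It is not Omnigraph's or Lance's implementation: `ToyStore` and its methods are hypothetical, a manifest is reduced to a set of fragment ids, and the age-based part of the retention policy is ignored.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class ToyStore:
    """Each manifest version is modelled as the set of fragments it references."""

    manifests: List[FrozenSet[str]] = field(default_factory=list)
    next_id: int = 0

    def _new_fragment(self) -> str:
        self.next_id += 1
        return f"frag-{self.next_id}"

    def write(self) -> None:
        # A small write adds one fragment and commits a new manifest.
        current = self.manifests[-1] if self.manifests else frozenset()
        self.manifests.append(current | {self._new_fragment()})

    def optimize(self) -> None:
        # Compaction: the new manifest references one rewritten fragment.
        # Earlier manifests are untouched, so old fragments stay reachable.
        self.manifests.append(frozenset({self._new_fragment()}))

    def cleanup(self, keep: int) -> None:
        # Retention by count only: drop all but the newest `keep` manifests.
        if keep < 1:
            raise ValueError("keep must be at least 1")
        self.manifests = self.manifests[-keep:]

    def fragments_on_disk(self) -> FrozenSet[str]:
        # A fragment survives while any retained manifest references it.
        return frozenset().union(*self.manifests)


store = ToyStore()
for _ in range(5):
    store.write()
before = len(store.fragments_on_disk())            # 5 small fragments
store.optimize()
after_optimize = len(store.fragments_on_disk())    # 6: nothing was freed
# keep=1 only keeps the toy small; real graphs should not go below 3.
store.cleanup(keep=1)
after_cleanup = len(store.fragments_on_disk())     # 1: space reclaimed
```

After optimize the head manifest reads a single fragment, yet the on-disk count has gone up by one. Only cleanup, by dropping the older manifests, lets the five small fragments go.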
Recovery floor
The open-time recovery sweep can roll a branch back to the manifest-pinned
version of an affected table. In the typical Phase B → Phase C drift case
that target is HEAD-1, and occasionally a version or two further back.
Setting --keep below 3 is unsafe: cleanup can then garbage-collect a
version the recovery sweep needs. The shipped default, --keep 10, is safe;
lower it only deliberately, and never below 3.
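Automation that wraps cleanup can enforce this floor before the command is ever invoked. The guard below is a minimal sketch; `validate_keep` and the constant names are hypothetical, and the two numbers come straight from the text above.

```python
RECOVERY_FLOOR = 3   # below this, the recovery sweep may lose its target version
DEFAULT_KEEP = 10    # the shipped default named in the text


def validate_keep(keep: int = DEFAULT_KEEP) -> int:
    """Return `keep` unchanged, or raise if it is below the recovery floor."""
    if keep < RECOVERY_FLOOR:
        raise ValueError(
            f"--keep {keep} is below the recovery floor of {RECOVERY_FLOOR}"
        )
    return keep
```

Calling `validate_keep(2)` raises before any destructive work starts, while `validate_keep()` and `validate_keep(3)` pass through.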
Concurrency
Maintenance commands parallelize across tables. The concurrency ceiling is
set by OMNIGRAPH_MAINTENANCE_CONCURRENCY (default 8). Tune down on
shared object stores if maintenance impacts foreground latency; tune up on
dedicated hosts to compress maintenance windows.
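The effect of the ceiling can be pictured as a bounded worker pool over tables. The sketch below is not Omnigraph's implementation: `maintenance_concurrency` and `run_per_table` are hypothetical names, and rejecting values below 1 is an assumption, since the source does not say how invalid settings are handled. Only the variable name and its default of 8 come from the text.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Mapping, Optional, TypeVar

T = TypeVar("T")


def maintenance_concurrency(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the ceiling from the environment, falling back to the default of 8."""
    env = os.environ if env is None else env
    value = int(env.get("OMNIGRAPH_MAINTENANCE_CONCURRENCY", "8"))
    if value < 1:
        # Assumption: a ceiling below 1 is treated as a configuration error.
        raise ValueError("concurrency ceiling must be at least 1")
    return value


def run_per_table(
    tables: Iterable[str],
    task: Callable[[str], T],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, T]:
    """Run `task` once per table with at most `ceiling` tables in flight."""
    names = list(tables)
    with ThreadPoolExecutor(max_workers=maintenance_concurrency(env)) as pool:
        # pool.map preserves input order, so zip pairs names with results.
        return dict(zip(names, pool.map(task, names)))
```

With a ceiling of 2 and six tables, no more than two tasks run at once, which is the trade-off the paragraph describes: a lower ceiling puts less pressure on a shared object store and lengthens the maintenance window.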
Internal schema migrations
The on-disk __manifest shape evolves over time. When a newer binary opens
a graph stamped at an older internal schema version, the publisher reconciles
the shape automatically on the first write. Reads are side-effect-free.
A binary that opens a manifest stamped at a higher version than it knows about refuses to publish with a clear "upgrade omnigraph first" error, so old binaries cannot clobber a newer schema. No operator action is required for in-place upgrades; just don't downgrade Omnigraph across a release that bumped the internal schema.
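The rules in this section amount to a version gate on the write path. The following toy model illustrates that gate under stated assumptions: `ToyManifest`, `ToyBinary`, and `UpgradeRequired` are hypothetical names, the internal schema version is reduced to a single integer, and whether an older binary can still read a newer manifest is not modelled because the source does not say.

```python
from dataclasses import dataclass


class UpgradeRequired(RuntimeError):
    """Raised when the manifest is newer than the binary understands."""


@dataclass
class ToyManifest:
    schema_version: int


@dataclass
class ToyBinary:
    known_version: int

    def read(self, manifest: ToyManifest) -> int:
        # Reads are side-effect-free: the stamp is inspected, never changed.
        return manifest.schema_version

    def publish(self, manifest: ToyManifest) -> None:
        if manifest.schema_version > self.known_version:
            # An older binary must not clobber a newer schema.
            raise UpgradeRequired("upgrade omnigraph first")
        if manifest.schema_version < self.known_version:
            # Reconcile the shape on the first write from a newer binary.
            manifest.schema_version = self.known_version
        # The actual commit would follow here; it is out of scope for the toy.


manifest = ToyManifest(schema_version=1)
newer = ToyBinary(known_version=2)
older = ToyBinary(known_version=1)

newer.read(manifest)      # leaves the stamp at 1
newer.publish(manifest)   # first write migrates the stamp to 2
try:
    older.publish(manifest)
    downgrade_blocked = False
except UpgradeRequired:
    downgrade_blocked = True
```

Once the newer binary has written, the older one is locked out of publishing, which is why rolling back across a schema bump is the one thing to avoid.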