Snapshots Update: Incremental Snapshot Changelog

Polygon Labs is thrilled to introduce the latest software improvement: incremental chaindata snapshots!

This enables faster and more reliable syncing for Polygon nodes.
With this release:

  • Monolithic .tar.zst chaindata downloads are deprecated in favor of an incremental snapshot approach, which allows more efficient downloads and reduces the risk of data corruption.

  • Daily incremental snapshots are taken for both Bor and Heimdall on mainnet and Mumbai, ensuring that snapshots for bootstrapping new Polygon PoS v1 nodes are never more than 24 hours stale.

  • To keep data growth in check, the snapshot data is pruned monthly, and the pruned snapshot serves as the initial bulk increment. After that, incremental snapshots of un-pruned data are appended daily until the next monthly cycle, creating a repeating cycle.

  • Users now download a ‘compiled-files.txt’ file that lists all incremental snapshot download URLs and their MD5 checksums.
    This file can be passed directly to the CLI tool aria2c as the download input, and checksum verification occurs automatically.

  • Bootstrapping and fully syncing a Polygon node should now take on the order of hours!
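The aria2c flow above can be sketched as follows. This is a hypothetical example: the URL, digest, and the exact layout of ‘compiled-files.txt’ are assumptions (aria2c's input-file format is a URL line followed by indented per-file options), so check the downloaded file against the wiki docs before relying on it.

```shell
#!/bin/sh
# Write a sample download list in aria2c's input-file format.
# The URL and MD5 digest below are placeholders, not real snapshot entries.
cat > compiled-files.txt <<'EOF'
https://snapshots.example.com/bor-chaindata.tar.zst.part0
  checksum=md5=0123456789abcdef0123456789abcdef
  out=bor-chaindata.tar.zst.part0
EOF

# -i reads the download list; -V (--check-integrity) makes aria2c verify
# each file against its checksum option after download. Commented out here
# so nothing is actually fetched:
# aria2c -i compiled-files.txt -V
```

Because the checksum lives next to each URL, a corrupted piece fails verification on its own and can be re-fetched individually instead of re-downloading the whole archive.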

To take advantage of this new system, check out our wiki for full documentation on how to use it: https://wiki.polygon.technology/docs/operate/snapshot-instructions-heimdall-bor/

Many thanks,

  • Polygon Labs

The snapshots are still effectively a monolithic tar.zst. Take Bor, for example: there's a huge 2 TB+ bulk file that small daily increments are then added to. But the bulk increments have to be rebundled into a single huge tar.zst, with the small increments applied from there. What's the point? You might as well keep the whole thing as one tar.zst, since it's still impossible to download and extract on the same 4 TB hard drive.

When I saw the bulk file had been split into increments I was so excited, I thought someone had heard my prayers, until it came time to repackage them into a single tar.zst. I couldn't even get the pieces back into the bulk tar.zst without running out of room, because the process doesn't cat them and remove them as it goes; it cats all of them into the tar before deleting any of them. Even then it can't be extracted on the same 4 TB disk, even after moving the other increments off the disk. The only option I have now is to rsync from another region, which will take forever.
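The space-saving reassembly described here (append one piece, then delete it, so peak extra disk usage is one piece rather than a second full copy of the archive) can be sketched as a small shell function. The `.part*` naming and the `reassemble` helper are assumptions for illustration, not the actual snapshot tooling:

```shell
#!/bin/sh
# Reassemble split snapshot pieces without doubling disk usage:
# each piece is appended to the output and deleted immediately,
# instead of concatenating everything first and deleting afterwards.
reassemble() {
  out=$1
  shift
  : > "$out"                     # start with an empty output file
  for part in "$@"; do
    cat "$part" >> "$out" && rm "$part"   # append, then free the space
  done
}

# Usage (part names are hypothetical):
# reassemble bor-chaindata.tar.zst bor-chaindata.tar.zst.part*
```

An even better option, where the pieces are plain byte-splits of one archive, would be to stream them straight into extraction (e.g. `cat *.part* | tar -I zstd -xf - -C chaindata/` with GNU tar) so the combined tar.zst never has to exist on disk at all.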
