On April 08, Polygon zkEVM Mainnet Beta was affected by an L1 reorg that resulted in a state synchronization issue on the network. Resolution required recomputation of the network’s state. A new version of the sequencer (v0.6.5-RC6), released shortly after, included an additional protection to prevent this issue from recurring in the short term. An additional, independent protection was added on April 09.
While the issue was being resolved, the network was down for approximately twenty minutes. Polygon PoS, chains built with Polygon CDK, and chains connected to the AggLayer were unaffected.
For dApps on zkEVM: Transactions made on April 08, between approximately 03:00 UTC and 07:30 UTC, had to be re-executed. These transactions may have been processed in a different block or may not have been processed at all; if this is the case, please get in touch below. Approximately 1,000 transactions may have been affected.
Description of the Incident
On April 08, the sequencer for Polygon zkEVM Mainnet Beta detected a reorg on the underlying L1, at block 19608115.
Because of the network’s recent state sync issue, a protection was already included in the sequencer. This protection worked as expected, but the network’s synchronizer did not handle the reorg properly: instead of updating the existing record, it inserted a new one. The sequencer then inserted a new Global Exit Root at batch 2005906. While this Global Exit Root was correct, it took an index that was one greater than it should have been.
When this batch was inserted in L1, the selected index did not yet exist there, causing the batch to be interpreted as invalid.
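The off-by-one can be illustrated with a toy model. This is only a sketch under stated assumptions: the record layout, the function names, and the L1 block numbers 100 to 102 are hypothetical and are not taken from the zkEVM node. It shows how inserting a record on reorg, rather than updating it, leaves the local table one entry longer than L1, so the latest local index points at an entry L1 does not have.

```python
# Toy model only: all names and values here are hypothetical illustrations.

def sync_record(records, l1_block, root, update_existing):
    """Apply a record observed at l1_block; a reorg can replay the same block."""
    for rec in records:
        if rec["l1_block"] == l1_block and update_existing:
            rec["root"] = root  # intended behaviour: overwrite in place
            return
    # First sighting of the block, or the faulty path: add a new record.
    records.append({"l1_block": l1_block, "root": root})


def latest_index(records):
    """Index the sequencer would pick for the newest record."""
    return len(records) - 1


def valid_on_l1(index, l1_leaf_count):
    """L1 only accepts an index that already exists on its side."""
    return 0 <= index < l1_leaf_count


def simulate(update_existing):
    records = []
    sync_record(records, 100, "root_a", update_existing)
    sync_record(records, 101, "root_b", update_existing)
    sync_record(records, 102, "root_c_orphaned", update_existing)
    # Reorg: block 102 is replaced and observed again with a different root.
    sync_record(records, 102, "root_c", update_existing)
    index = latest_index(records)
    # Assumption: L1 holds exactly three entries (indices 0, 1, 2).
    return index, valid_on_l1(index, l1_leaf_count=3)


print(simulate(update_existing=True))   # update path
print(simulate(update_existing=False))  # insert path
```

With the update path the chosen index exists on L1; with the insert path it is one greater and the check fails.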
In summary, the network’s trusted state included this batch as valid while the virtual state considered it invalid. Resolving this required re-execution of the trusted state with this batch treated as invalid in order to build the proof. Consequently, some transactions were not executed and some transactions were re-executed with different results.
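To see why re-execution can change outcomes, consider a minimal sketch with a hypothetical two-account ledger. The accounts, amounts, and batch ids are invented for illustration and do not reflect real zkEVM execution. Skipping one batch drops its transactions, and a later transaction that depended on them then produces a different result.

```python
# Toy ledger only: accounts, amounts and batch ids are hypothetical.

def execute(batches, invalid_batch_ids=frozenset()):
    """Replay batches in order, skipping any batch marked invalid."""
    balances = {"alice": 10, "bob": 0}
    results = []
    for batch_id, txs in batches:
        if batch_id in invalid_batch_ids:
            # Transactions in an invalid batch are never applied.
            results.extend((tx, "not executed") for tx in txs)
            continue
        for sender, receiver, amount in txs:
            tx = (sender, receiver, amount)
            if balances[sender] >= amount:
                balances[sender] -= amount
                balances[receiver] += amount
                results.append((tx, "ok"))
            else:
                results.append((tx, "failed"))
    return balances, results


batches = [
    (1, [("alice", "bob", 5)]),  # batch later treated as invalid
    (2, [("bob", "alice", 3)]),  # depends on the funds moved in batch 1
]

trusted = execute(batches)                              # batch 1 counted as valid
recomputed = execute(batches, invalid_batch_ids={1})    # batch 1 treated as invalid
print(trusted)
print(recomputed)
```

In the recomputed run the first transfer is not executed, and the second one fails because the sender never received the funds.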
Future Mitigation
This is the second state synchronization issue for Polygon zkEVM. The core engineers for zkEVM have since released two independent protections, each of them robust enough to prevent L1 reorgs from causing this state synchronization issue to recur.
- Sequencer v0.6.5-RC6, released April 08, includes a mechanism for detecting when an L1 reorg occurs; once a reorg is detected, the network will continue operating normally, but the sequencer will temporarily stop accepting new Global Exit Roots until confirmation that the network’s state is synchronized.
- An additional protection was added to the synchronizer on April 09. It allows the network to synchronize from older blocks, which are set through a config parameter and act as a safe checkpoint.
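The two protections above can be sketched roughly as follows. This is a hedged illustration, not the sequencer's or synchronizer's actual code: the class `GerGate`, the function `resync_start_block`, and the parameter `checkpoint_distance` are hypothetical names, and modelling the config parameter as a distance in blocks is an assumption.

```python
from dataclasses import dataclass


@dataclass
class GerGate:
    """Hypothetical gate: hold back new Global Exit Roots after an L1 reorg."""
    paused: bool = False

    def on_l1_reorg(self):
        # A reorg was detected: stop accepting new GERs for now.
        self.paused = True

    def on_state_synchronized(self):
        # State is confirmed synchronized: accept GERs again.
        self.paused = False

    def accept(self, ger, pending):
        # The rest of the network keeps operating; only new GERs are held back.
        if self.paused:
            return False
        pending.append(ger)
        return True


def resync_start_block(last_synced_block, checkpoint_distance):
    """Pick an older L1 block to resynchronize from.

    Assumption: the config parameter is a distance in blocks.
    """
    return max(0, last_synced_block - checkpoint_distance)


gate = GerGate()
pending = []
gate.accept("ger_1", pending)        # accepted before any reorg
gate.on_l1_reorg()
gate.accept("ger_2", pending)        # held back while paused
gate.on_state_synchronized()
gate.accept("ger_3", pending)        # accepted again
print(pending)
print(resync_start_block(19608115, checkpoint_distance=64))
```

Keeping the two mechanisms separate mirrors the text: either one alone is meant to be sufficient, so neither depends on the other's state.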
Timeline of Events
April 08, 2024
02:56 UTC:
The sequencer for Polygon zkEVM Mainnet Beta detected a reorg at block 19608115
- Reorg: https://etherscan.io/block/19608115/f
- The synchronizer failed to properly handle this reorg, inserting a new record instead of updating the existing record.
Approximately 03:00 UTC:
The sequencer inserted a new Global Exit Root at batch 2005906. This Global Exit Root was correct, but improper handling by the synchronizer caused the sequencer to take an incorrect index.
- When this batch was committed to the L1, it was interpreted as invalid.
- At this point, the network’s trusted state and virtual state were synchronized with different states.
03:36 UTC:
The core engineers for zkEVM received an alert of a state synchronization issue.
06:47 UTC:
The correct state of the network was recomputed at batch 2005966, using the same sequenced batches except for those that may have been invalid.
07:30 UTC:
The network was restarted and resumed normal operation.
- Infrastructure providers were asked to resync permissionless nodes.
- Etherscan and The Graph reindexed from L2 block 11425001.
07:48 UTC:
Network downtime began.
08:06 UTC:
Network downtime ended.
09:50 UTC:
The sequencer was updated to v0.6.5-RC6, which includes the first protection mechanism described above. No action was required from infrastructure providers.
The core engineers for zkEVM have since added an additional, independent protection to prevent this from happening again.