Proposal: Improved UX with Milestones for Polygon PoS

Table of Contents:

  • Abstract
  • Motivation
  • High-Level Overview
    • Current Finality Time Breakdown (256 Blocks)
    • Proposed Finality Time Breakdown
    • Milestone Lifecycle
  • Testing

Abstract

When a user interacts with the Polygon PoS chain, they need to know when the transaction is finalized. In the current architecture of Polygon PoS, a user needs to wait for a few minutes to make sure that their transaction is included in the canonical chain, which is arguably long. To overcome this issue, we introduce the concept of deterministic finality on the Polygon PoS chain. With deterministic finality, a user can be certain that a finalized block will always be in the canonical chain and, hence, all the block’s transactions are final. This means that the order and inclusion of these transactions cannot be altered under reasonable assumptions.

Motivation

The primary motivation for this change is to improve user experience. The Polygon PoS chain used to face reorgs that were sometimes deep, and these did not account for the finality of the chain. After the Delhi fork, the shorter sprint length has greatly reduced the occurrence of deep reorgs, as a single producer now produces fewer consecutive blocks. In this proposal, we add deterministic finality through the implementation of milestones. This will improve the user experience on the Polygon chain.

High-Level Overview

A milestone is a set of contiguous Bor blocks that are considered final in the canonical chain.

After each milestone length (minimum 12 blocks), Heimdall would finalize the end block of that milestone on the Bor chain via Tendermint consensus on the Heimdall layer. Bor nodes would regularly fetch this milestone from Heimdall and check that their local chain matches it.

If the fetched milestone matches the local chain, the Bor node would whitelist that milestone in its local datastore. If it doesn’t match the local chain, the Bor node would rewind to the last whitelisted milestone stored locally.
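The whitelist-or-rewind rule described above can be sketched roughly as follows. This is a toy model, not the Bor implementation: the `Milestone` and `LocalChain` types, their field names, and the choice to treat a not-yet-seen block as a mismatch are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Milestone:
    """A finalized range of Bor blocks as fetched from Heimdall (hypothetical shape)."""
    start_block: int
    end_block: int
    end_hash: str  # hash of the milestone's end block


@dataclass
class LocalChain:
    """Toy stand-in for a Bor node's local chain: block number -> block hash."""
    blocks: Dict[int, str]
    whitelisted: List[Milestone] = field(default_factory=list)

    def apply_milestone(self, m: Milestone) -> str:
        """Whitelist the milestone if it matches the local chain, else rewind."""
        # Assumption: a block this node has not seen yet counts as a mismatch.
        if self.blocks.get(m.end_block) == m.end_hash:
            self.whitelisted.append(m)
            return "whitelisted"
        # Mismatch: drop every block after the last whitelisted milestone.
        rewind_to = self.whitelisted[-1].end_block if self.whitelisted else 0
        self.blocks = {n: h for n, h in self.blocks.items() if n <= rewind_to}
        return f"rewound to {rewind_to}"


chain = LocalChain(blocks={n: f"a{n}" for n in range(1, 25)})
print(chain.apply_milestone(Milestone(1, 12, "a12")))   # local chain matches
print(chain.apply_milestone(Milestone(13, 24, "b24")))  # node was on another fork
```

In this sketch the first milestone is whitelisted, and the second, which comes from a different fork, makes the node fall back to block 12, the end of the last whitelisted milestone.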

This process would ensure that Bor nodes do not follow the wrong fork for many blocks and, as a result, reach finality much sooner than in the current implementation. With the implementation of milestones, we also introduce a finalized tag on Bor, similar to Ethereum, which returns the latest finalized block on the Bor chain.
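For dapp developers, querying the finalized tag might look like the sketch below. It assumes Bor serves the tag through the standard Ethereum JSON-RPC method `eth_getBlockByNumber`, as Ethereum clients do; the proposal itself does not specify the RPC surface. The helper name `finalized_block_request` is hypothetical.

```python
import json


def finalized_block_request(request_id: int = 1) -> str:
    """Build a JSON-RPC request body asking for the latest finalized block."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        # "finalized" is the block tag; False asks for transaction hashes only.
        "params": ["finalized", False],
        "id": request_id,
    }
    return json.dumps(payload)


# The body would be POSTed to a node's RPC endpoint (endpoint not shown here).
print(finalized_block_request())
```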

Current Finality Time Breakdown (256 Blocks)

Block mining in Bor: 576 sec (256 blocks * 2.25 sec (avg. Bor block time))
Total time: 576 sec

Proposed Finality Time Breakdown

Block mining in Bor: 27 sec (12 blocks (milestone length) * 2.25 sec (avg. Bor block time))
Bor block confirmation: 36 sec (16 block confirmations * 2.25 sec (avg. Bor block time))
Milestone proposal in Heimdall: 8 sec (Heimdall block time)
Passing of milestone proposal: 24 sec (3 Heimdall blocks)
Milestone fetch interval in Bor: 10 sec (timer interval in Bor)
Total (average-case scenario): 105 sec

Milestones reduce the total time to finality by 471 seconds (from 576 to 105 seconds).
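The arithmetic behind both breakdowns can be re-checked with a few lines of Python. All figures come from the breakdowns above; only the variable names are ours.

```python
AVG_BOR_BLOCK_TIME = 2.25   # seconds, average Bor block time
HEIMDALL_BLOCK_TIME = 8     # seconds, Heimdall block time

# Current approach: wait for 256 Bor blocks.
current_total = 256 * AVG_BOR_BLOCK_TIME

# Proposed approach: each step of the milestone flow, in seconds.
proposed = {
    "block mining in Bor": 12 * AVG_BOR_BLOCK_TIME,
    "Bor block confirmation": 16 * AVG_BOR_BLOCK_TIME,
    "milestone proposal in Heimdall": 1 * HEIMDALL_BLOCK_TIME,
    "passing of milestone proposal": 3 * HEIMDALL_BLOCK_TIME,
    "milestone fetch interval in Bor": 10,
}
proposed_total = sum(proposed.values())

print(current_total, proposed_total, current_total - proposed_total)
# prints: 576.0 105.0 471.0
```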

Milestone Lifecycle

The below image represents the lifecycle of a milestone and interactions between the bor and heimdall client during the process:

Testing

For simulating the network, we created multiple devnets of different validator node counts (3, 5, 7 etc.). Below is a sample of finality block confirmations, obtained on a 7-node network:

Currently, dapps use a high number of block confirmations on the Bor chain to make sure that the chain no longer reorgs before treating transactions and blocks as finalized. Based on our tests on devnets, the introduction of milestones yields much faster finality (~5 to 6 times faster), ranging from 33 to 48 blocks in different scenarios (network delays taken into account), resulting in a better user experience.

I love it! Feels like we’re getting the best of both worlds with this and really putting that heimdall layer to good use.

A few quick questions stemming from my lack of familiarity with tendermint:

  1. Is the bor/heimdall code change for this public? I’d love to take a look.
  2. Will validators in the current span vote on the milestone block, or will all validators?
  3. What happens if half of the validators are following the primary and the other half are following the first backup (so neither would have 66%, which from what I understand is necessary for consensus in tendermint) and everyone’s bor node gets locked so nobody can switch to the other fork? From what I understand of tendermint, the consensus layer would get stuck in the propose-prevote-precommit loop, but the proposal wouldn’t change b/c bor wouldn’t change.
  4. Would lack of 66% consensus in heimdall mean everyone’s bor gets rewound back to the start block to replay bor’s block construction again, or would bor hang until heimdall reaches consensus?
  5. Would this change impact the current “slashing not implemented” approach towards staking?

The potential for deadlock is honestly the only downside I can think of. If you guys have a solution that explicitly precludes deadlocks from occurring then I’m all in on this pip :heart_eyes:

Hi Thogard!
Thanks a lot for raising your questions.
I will answer them in the order they were asked.

  1. Yes, we have “vaibhav/Milestone” PRs in Bor and Heimdall. You can look into them and leave comments if anything is unclear. We would love to answer your queries as soon as possible.
    Bor PR: Vaibhav/milestone [DO NOT MERGE] by VAIBHAVJINDAL3012 · Pull Request #654 · maticnetwork/bor · GitHub
    Heimdall PR: Vaibhav/milestone [DO NOT MERGE] by VAIBHAVJINDAL3012 · Pull Request #939 · maticnetwork/heimdall · GitHub

  2. All the validators will vote on the Milestone Block.

  3. Take the case where 50% of the nodes (by stake) are on fork ‘A’ and the other 50% are on fork ‘B’. Suppose the milestone proposer proposes a milestone from fork ‘A’. The 50% of nodes on that fork vote ‘YES’ on it, lock their Bor client, and store the milestone’s ID. With only 50% of the vote, the milestone fails, and its ID is stored in the failed milestone list in Heimdall. Bor clients continuously query the failed milestone list; if a client finds the ID for which it locked its Bor in that list, it releases the lock. This prevents the deadlock.
    The Tendermint consensus layer would not get stuck in any case, as Tendermint only provides us with the voting and finality platform. It has no link with Bor.

  4. Lack of consensus will not rewind the node. A node rewinds only when it finds that the fetched milestone (successfully voted in by a 2/3+1 majority in Heimdall) doesn’t match its local Bor chain.

  5. For now, it will not impact the current “slashing not implemented” approach towards staking.
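As a rough illustration of answers 3 and 4, the lock-and-release flow can be modeled as below. This is only a sketch: the `Heimdall` and `BorClient` classes and their methods are hypothetical names, not the code in the PRs, and "2/3+1 majority" is read here as strictly more than two-thirds of the stake.

```python
from typing import Optional, Set


class Heimdall:
    """Toy model of milestone voting with a failed-milestone list."""

    def __init__(self) -> None:
        self.failed_ids: Set[str] = set()

    def tally(self, milestone_id: str, yes_stake: int, total_stake: int) -> bool:
        # Assumption: a milestone passes with strictly more than 2/3 of stake.
        passed = 3 * yes_stake > 2 * total_stake
        if not passed:
            self.failed_ids.add(milestone_id)
        return passed


class BorClient:
    """Toy model of a Bor client that locks while its vote is pending."""

    def __init__(self) -> None:
        self.locked_on: Optional[str] = None

    def vote_yes(self, milestone_id: str) -> None:
        # Voting YES locks the client on the proposed milestone.
        self.locked_on = milestone_id

    def poll_failed(self, failed_ids: Set[str]) -> None:
        # Release the lock once the milestone shows up in the failed list.
        if self.locked_on in failed_ids:
            self.locked_on = None


heimdall = Heimdall()
bor = BorClient()
bor.vote_yes("m-1")                                     # node on fork A locks
heimdall.tally("m-1", yes_stake=50, total_stake=100)    # 50% is not enough
bor.poll_failed(heimdall.failed_ids)                    # lock is released
print(bor.locked_on)  # None
```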

Would love to get more feedback from you. :blush:

Thank you for the fast response. Just to confirm my understanding: if there is a 50/50 fork then the fork will be settled in bor via the existing bor consensus mechanisms. Heimdall will not reach consensus until 2/3 of Bor clients agree on a block hash. In other words, when there isn’t a fork then we will have much faster finality, but outlier scenarios that delay finality will still be possible (but identifiable).

Is that understanding correct?

This proposal seems promising and could lead to significant improvements in the user experience on the Polygon PoS chain. I like the reduced time required for a transaction to be considered final; it clearly improves UX, and the testing results indicate as much.

Yes, you are right. But based on our analysis of past data, there were very few instances of two forks existing in the network.

Sounds like a win/win proposal to me - I see lots of upside and no downside. Great work guys.

Much appreciated, love the way you explain everything especially the motivation part, I really need it for my work.

This is an excellent motivation. The content is well-researched and presented in a clear and engaging manner. I also appreciate the inclusion of High-Level Overview, which led me to discover other useful points.

The proposed finality time breakdown is well-presented and shows a significant reduction in finality time compared to the current implementation. The explanation of each component makes it easy to follow the reasoning behind the calculations.