PIP-64: Validator-Elected Block Producer

Hello!

This is a great PIP in general and I’m all for pushing boundaries. I have some concerns around witnesses and their overhead costs. I have made some assumptions about what these proofs will entail; if that premise is wrong, then there’s no need to read the rest of the doc.

  • Trie Node Proofs: Cryptographic proofs of state trie portions accessed during execution
  • Account Data: State information for accounts accessed during block execution
  • Storage Proofs: Evidence of storage slot values for contract interactions
  • Code Witnesses: Contract bytecode accessed during execution
  • Proof Metadata: Information to organize and verify witness components
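
As a thought experiment, the assumed composition above can be sketched as a data structure. Every field name here is illustrative, since the PIP does not define a witness format:

```python
from dataclasses import dataclass, field

@dataclass
class Witness:
    """Hypothetical witness layout mirroring the five assumed components."""
    trie_nodes: dict = field(default_factory=dict)      # node hash -> encoded trie node
    accounts: dict = field(default_factory=dict)        # address -> account state
    storage_proofs: dict = field(default_factory=dict)  # (address, slot) -> value proof
    codes: dict = field(default_factory=dict)           # code hash -> contract bytecode
    metadata: dict = field(default_factory=dict)        # block number, proof counts, etc.

    def size_bytes(self) -> int:
        """Rough serialized size: sum of key and value byte lengths."""
        total = 0
        for component in (self.trie_nodes, self.accounts, self.codes):
            total += sum(len(k) + len(v) for k, v in component.items())
        total += sum(len(a) + len(s) + len(v)
                     for (a, s), v in self.storage_proofs.items())
        return total
```

Even a toy structure like this makes the sizing question concrete: without a defined format, no two implementations will agree on `size_bytes()` for the same block.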

Concerns:

  • The exact composition of witnesses is not explicitly defined; can we get a breakdown of what these witnesses will contain?
  • Different implementations could significantly vary in size and complexity
  • Witness format standardisation would be required for interoperability

The write up assumes witnesses would rely on:

  • Merkle Patricia Tries: Current standard in Ethereum/Polygon, with proofs consisting of sibling nodes along paths from leaf to root
  • Hash Functions: Keccak-256 for consistency with existing EVM infrastructure
  • Potential Future Upgrades: Verkle tries or other vector commitment schemes
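
To make the sibling-path assumption concrete, here is a minimal proof-verification sketch. Note that Python's hashlib has no Keccak-256, so SHA3-256 stands in (the two differ only in a padding byte), and a real Merkle Patricia Trie proof carries hex-prefix-encoded path nodes rather than the plain binary siblings shown here:

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash: Ethereum uses Keccak-256, which hashlib lacks;
    # SHA3-256 differs from it only in the padding byte.
    return hashlib.sha3_256(data).digest()

def verify_merkle_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """Fold sibling hashes from leaf to root; 'L'/'R' marks the sibling's side."""
    acc = h(leaf)
    for sibling, side in proof:
        acc = h(sibling + acc) if side == "L" else h(acc + sibling)
    return acc == root
```

The proof length grows with trie depth, which is why witness size for "cold", deeply nested state is a recurring concern below.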

Witness Generation Process

The assumed witness generation process includes:

  1. State Access Tracking: VEBloP tracks all state accesses during execution
  2. Trie Node Collection: Relevant trie nodes are collected for each access
  3. Proof Construction: Cryptographic proofs are constructed for each state element
  4. Witness Compilation: All proofs are compiled into a structured format
  5. Witness Propagation: Distribution alongside blocks via dedicated protocol
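
The five assumed steps can be mocked end-to-end in a few lines; state reads, trie nodes and proofs are all placeholders here, since none of this is specified in the PIP:

```python
def produce_block_with_witness(txs, state):
    """Toy run of the five assumed steps; every artifact is a placeholder."""
    accessed = []                                    # 1. state access tracking
    def tracked_read(key):
        if key not in accessed:
            accessed.append(key)
        return state[key]
    results = [tracked_read(tx) for tx in txs]       # execution (mocked as reads)
    nodes = {k: f"trie-node({k})" for k in accessed}     # 2. trie node collection
    proofs = {k: f"proof({k})" for k in accessed}        # 3. proof construction
    witness = {"nodes": nodes, "proofs": proofs,
               "meta": {"accesses": len(accessed)}}      # 4. witness compilation
    return results, witness                          # 5. shipped alongside the block
```

Even this toy version shows where the synchronization concern comes from: steps 2-4 must see exactly the access set that execution produced, or validators receive an inconsistent witness.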

Concerns:

  • State access tracking adds computational overhead to block production
  • Proof construction is computationally intensive and may become a bottleneck
  • The process assumes deterministic execution paths, which may not hold for all EVM operations
  • Witness generation must be perfectly synchronized with execution to avoid inconsistencies

Caching and Transaction Pattern Concerns

Caching Effectiveness

The analysis assumes witness caching could reduce witness size by ~40-60%, but this depends heavily on transaction patterns:

Concerns:

  • Temporal Locality Assumption: Caching assumes transactions access recently used state
  • Contract Popularity Skew: Effectiveness varies based on contract usage distribution
  • Changing Patterns: DeFi trends, NFT mints, or other activity spikes can invalidate caching assumptions
  • Cache Invalidation: State changes require careful cache invalidation strategies
  • Cache Size Limits: Memory constraints limit cache effectiveness for validators
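
The temporal-locality concern can be made measurable with a bounded LRU cache: its hit rate under a given access trace approximates the fraction of trie nodes that would not need re-sending, i.e. the assumed ~40-60% savings. The sketch below is illustrative, not a proposed design:

```python
from collections import OrderedDict

class TrieNodeCache:
    """Bounded LRU cache; hit rate models witness-size savings under locality."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key, loader):
        if key in self.store:
            self.store.move_to_end(key)      # refresh recency on a hit
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = loader(key)                  # fetch from the witness / disk
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict least recently used
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Replaying real mainnet access traces through something like this, at realistic capacities, would be a cheap way to validate (or falsify) the 40-60% figure before committing to it.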

Transaction Type Distribution

The analysis assumes a transaction mix of 70% simple transfers, 25% token transfers, and 5% complex DeFi:

Concerns:

  • Evolving Usage Patterns: This distribution may not hold over time
  • DeFi Complexity Growth: Increasing complexity of DeFi protocols may increase witness sizes
  • New Contract Types: Novel contract patterns may have unpredictable witness requirements
  • MEV Transactions: MEV-related transactions often access many storage slots, generating larger witnesses
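
A back-of-envelope calculation shows how sensitive average witness size is to the assumed mix. The per-type sizes below are illustrative placeholders, not numbers from the PIP:

```python
# Assumed mix from the analysis, with illustrative per-type witness sizes (KB).
mix        = {"simple_transfer": 0.70, "token_transfer": 0.25, "complex_defi": 0.05}
witness_kb = {"simple_transfer": 2.0,  "token_transfer": 6.0,  "complex_defi": 40.0}

expected_kb = sum(mix[t] * witness_kb[t] for t in mix)         # ~4.9 KB per tx

# Shift just 10 points of volume from simple transfers to complex DeFi:
shifted = {"simple_transfer": 0.60, "token_transfer": 0.25, "complex_defi": 0.15}
shifted_kb = sum(shifted[t] * witness_kb[t] for t in shifted)  # ~8.7 KB per tx
```

Under these placeholder sizes, the average witness nearly doubles from a modest mix shift, so any bandwidth estimate keyed to the 70/25/5 assumption degrades quickly as usage patterns evolve.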

State Access Patterns

Concerns:

  • Hot Spots: Certain state elements (popular tokens, liquidity pools) may be accessed frequently
  • State Growth: As state grows, witness sizes for accessing “cold” state increase
  • Cross-Contract Interactions: Complex transactions touching multiple contracts generate larger witnesses
  • Storage Layout: Contract storage layout significantly impacts witness size

Network and Propagation Concerns

Bandwidth Requirements

The analysis estimates 200-500 Mbps bandwidth requirements for unoptimized witnesses:

Concerns:

  • Geographic Disparities: Validators in regions with limited bandwidth may be disadvantaged
  • Network Congestion: Peak network usage may cause propagation delays
  • ISP Limitations: Data caps or throttling could affect validator participation
  • Cost Barriers: High bandwidth requirements increase operational costs
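
For intuition, the sustained bandwidth needed just to receive witnesses in time follows directly from witness size and block time. The 100 MB input below is illustrative, not a PIP estimate:

```python
def required_mbps(witness_mb_per_block: float, block_time_s: float) -> float:
    """Sustained bandwidth needed to receive each witness before the next block."""
    return witness_mb_per_block * 8 / block_time_s  # megabytes -> megabits

# An illustrative 100 MB unoptimized witness arriving every 2 seconds:
# required_mbps(100, 2.0) -> 400.0 Mbps, inside the 200-500 Mbps range above
```

Note this is a floor: gossip redundancy, retransmissions and concurrent peer traffic all sit on top of it, which is what makes the geographic and ISP concerns bite.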

Propagation Strategies

Concerns:

  • Gossip Protocol Efficiency: Standard gossip protocols may be inefficient for large witnesses
  • Network Topology: Suboptimal network topology could increase propagation times
  • Propagation Failures: Partial witness propagation could lead to validation failures
  • Prioritization: Lack of witness part prioritization could delay critical validation

Computational Concerns

Generation Overhead

Concerns:

  • CPU Intensity: Witness generation adds ~50-75% overhead to block production
  • Memory Usage: Tracking state access requires significant memory
  • Parallelization Challenges: Some aspects of witness generation are difficult to parallelize
  • Hardware Requirements: May necessitate specialized hardware for VEBloPs

Verification Costs

Concerns:

  • Verification Time Variability: Complex transactions require more verification time
  • Proof Verification Overhead: Cryptographic verification adds significant CPU load
  • Batching Inefficiencies: Verification may not scale linearly with transaction count
  • Resource Competition: Verification competes with other validator tasks

Consensus Impact Concerns

Finality Time

Concerns:

  • Witness Propagation Delay: Large witnesses add ~400-600ms to propagation time
  • Verification Bottlenecks: Witness verification may become the critical path for finality
  • Timeout Parameters: Consensus timeouts may need adjustment for witness overhead
  • Liveness Risk: Excessive delays could trigger unnecessary view changes/leader rotations
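
The ~400-600ms figure decomposes into transfer time plus verification time; a toy model (illustrative inputs only) shows how either term can push the total past a consensus timeout:

```python
def witness_delay_ms(witness_mb: float, bandwidth_mbps: float, verify_ms: float) -> float:
    """Transfer time (size over bandwidth) plus verification time, in ms."""
    transfer_ms = witness_mb * 8 / bandwidth_mbps * 1000
    return transfer_ms + verify_ms

def risks_view_change(delay_ms: float, timeout_ms: float) -> bool:
    """Liveness check: does witness overhead alone exhaust the consensus timeout?"""
    return delay_ms >= timeout_ms
```

With illustrative inputs of a 25 MB witness, 500 Mbps effective bandwidth and 100 ms of verification, the added delay is 500 ms, squarely in the quoted range; halving the bandwidth alone pushes it to 900 ms, which is why timeout parameters may need retuning.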

Security Implications

Concerns:

  • Validator Participation: Bandwidth/computational requirements may reduce validator set
  • Centralization Pressure: Only well-resourced entities may be able to serve as VEBloPs
  • New Attack Vectors: Witness-specific DoS attacks become possible
  • Censorship Resistance: Witness overhead could weaken forced transaction mechanisms

Possible Alternative Approach

ZK-Based State Validation

  • Replace explicit witnesses with zero-knowledge proofs of state access
  • VEBloP generates ZK proofs that transactions were executed correctly
  • Validators verify compact ZK proofs instead of re-executing with witnesses
  • Leverage the latest Succinct breakthrough, SP1 Hypercube block verification (https://x.com/succinctlabs/status/1924845712921264562?s=46)

Benefits:

  • Dramatically reduced witness size
  • Constant-size proofs regardless of transaction complexity
  • Potential privacy benefits

Overall, I love the proposal, and we should push ahead with the consensus change to bring Polygon PoS in line with the latest generation of DLT ecosystems.


Considering the potential for a single VEBloP to have significant influence over transaction ordering and MEV extraction within a checkpoint, what mechanisms would be considered (or researched) to ensure fair transaction inclusion and prevent censorship or malicious MEV exploitation by the elected block producer, before more sophisticated automated checks might be implemented?

The following reflects the views of L2BEAT’s governance team, composed of @kaereste, @Sinkas, and @Manugotsuka, and it’s based on their combined research, fact-checking, and ideation.

We asked L2BEAT’s research team to review PIP-64 and the accompanying PIP-65 for the VEBloP economic model. Their findings informed the comment below.

In summary, we understand the proposed changes to be the following:

  • We’ll be moving away from the current ~105 block-producing validators, centralizing block proposals to a few validator-elected proposers (VEBloP).
  • The existing ~105 validators will validate all blocks ‘statelessly’, which keeps their resource requirements low but requires more input data than just the transactions (state fragments called witnesses or traces). They also elect and churn the VEBloPs.
  • Forced transactions are introduced to detect censorship and enforce censorship resistance against the new centralized block proposers. They will be forced from Ethereum L1.
  • To keep validator incentives, transaction fees are redirected from the centralized block producers to the ~105 validators that do stateless block validation.

Gas limits and potential transactions per second (TPS) will increase based on the above changes. But there are a few things we’re unclear on, which we’d like to see clarified.

Centralizing block proposals will lead to more potential for MEV. Even as things are right now, MEV is not solved or addressed in-protocol, and this proposal doesn’t offer any anti-MEV measures either, apart from validators manually rotating out VEBloPs.

Given that, validators will need to monitor for malicious block proposers and especially MEV. How will that be done in practice? Is there any tooling available for the validators to use, assuming that validators cannot run a full node?

The forced transaction mechanism is interesting and would also benefit from more details or specifications.

Lastly, stateless execution / validation is not yet in production anywhere in the Ethereum ecosystem. There’s no information available on how the witnesses would be generated, distributed and validated apart from the quoted research from outside the Polygon ecosystem.


Hi all. Arash here from the Polygon Labs product team.

PIP-64 defines a significant architectural change for the Polygon PoS chain. As safety and liveness are the paramount concern of most users of the chain, we recommend taking a cautionary approach with this upgrade. This would entail:

  • Starting with a small, controlled set of block producers, run and maintained by Polygon Labs. This is for 3 reasons:
  1. A small set allows for validation of the span rotation mechanisms and for redundancy, whilst keeping the numbers low to ensure ease of maintenance and management.
  2. Although extensive models and estimates have been made for the infrastructure costs related to running this new type of node (a block producer capable of creating larger blocks in less time), the reality of running this in a production environment is unknown. Monitoring the usage demands and cost implications of these nodes will allow other validators to make informed decisions when deciding whether to participate in block production.
  3. With such a significant change, allowing for immediate and quick iterations is much more achievable within a single team.
  • As the architectural change stabilises, and network performance increases over the months following the release, validator operators can opt in to run block producers, armed with real-world data on cost and maintenance requirements.

  • The change in architecture also potentially impacts tokenomics. This period will allow for monitoring and further refinement of any changes to how transaction fees and MEV operate on the chain.
