Overview
We’d like to share an idea for building a multi-proof system for Polygon zkEVM, based on the “2FA zk-rollups using SGX” discussion on the Ethereum Research forum, initiated by Justin Drake.
TL;DR: zk-rollups are theoretically sound but error-prone in practice, because their implementation is quite complex. To hedge against bugs in the ZK verifier, a hardware solution, a Trusted Execution Environment (TEE), can serve as a second-factor verifier that gives zk-rollups a second layer of security.
In this proposal, we describe the design of a TEE-based multi-proof system for Polygon zkEVM.
Problem
The problem is well framed in the original post, “2FA zk-rollups using SGX”. We summarize the key points here:
- Single-proof zk-rollups are error-prone in their engineering implementation
- There are bugs in rollup verifiers (circuits and ZK verifiers):
- Research: “What don’t we know? Understanding Security Vulnerabilities in SNARKs”
- It’s hard to detect bugs in a verifier
- A single vulnerability may put all the assets on the L2 at risk
- So we should introduce multi-proof to remove the single point of failure in the ZK verifier
Solution
Ideally, we would have diversified ZK provers and verifiers to generate multiple proofs. However, such a system takes time to build and costs more to operate. We propose a pragmatic solution: a TEE-based multi-prover for Polygon zkEVM.
The solution is based on the original post, “2FA zk-rollups using SGX”. Again, we summarize the key points here:
- Trusted Execution Environment (TEE)
- A black box built into some special CPUs. It can execute code inside and ensures that the memory and state are tamper-proof and confidentiality-preserving, so that nobody, including the hardware owner, can manipulate the execution or steal data from it.
- Intel SGX is the most popular TEE implementation. It can generate a “remote attestation”, a report signed by the hardware, to prove that a certain program is running inside a genuine TEE.
- We can build a proving system using TEE
- First, the prover generates a key pair (pk, sk) inside the TEE
- Then register the TEE prover with a proper remote attestation and store the public key pk on L1
- The prover produces the state transition proof
- Prover to commit to (pre_state_root, post_state_root, block_root)
- Prover to sign the commitment with the key pair
- Verifier contract to check the ECDSA signature of the commitment with the pk
- Nice properties:
- Safe
- The TEE prover is purely an addition to ZKP
- In the worst case, it falls back to the zk-rollup security.
- Unstoppable
- We can run N provers and require 1-of-N approval on L1
- This can be achieved by giving some incentive to TEE provers
- Low cost
- Very little gas consumption: just an ECDSA verification
- Minimal change to L1 contracts
- Prover is simple: just an EVM simulation in TEE
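The commit, sign, and verify steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the proposal's actual implementation: the function names are hypothetical, SHA-256 stands in for whatever hash the contract would use, the curve is assumed to be secp256k1, and the pure-Python arithmetic is neither constant-time nor production-grade.

```python
import hashlib
import secrets

# secp256k1 domain parameters (curve y^2 = x^3 + 7 over F_P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def keygen():
    """Runs inside the TEE: sk never leaves it, pk is registered on L1."""
    sk = secrets.randbelow(N - 1) + 1
    return sk, scalar_mul(sk, G)


def commitment(pre_state_root: bytes, post_state_root: bytes, block_root: bytes) -> int:
    """Hash the tuple (pre_state_root, post_state_root, block_root) to a scalar."""
    digest = hashlib.sha256(pre_state_root + post_state_root + block_root).digest()
    return int.from_bytes(digest, "big") % N


def sign(sk: int, z: int):
    """Prover side: ECDSA signature over the commitment z."""
    while True:
        k = secrets.randbelow(N - 1) + 1
        r = scalar_mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * sk) % N
        if r and s:
            return r, s


def verify(pk, z: int, sig) -> bool:
    """Verifier side: the check the L1 contract would perform against pk."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = point_add(scalar_mul(z * w % N, G), scalar_mul(r * w % N, pk))
    return point is not None and point[0] % N == r


sk, pk = keygen()
z = commitment(b"\x01" * 32, b"\x02" * 32, b"\x03" * 32)
signature = sign(sk, z)
print(verify(pk, z, signature))
```

A signature only verifies for the exact tuple that was signed, so a prover cannot reuse an approval for a different post-state root.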
To build it, we need to implement: (1) prover registration, (2) redundancy of the provers, and (3) the prover logic.
Register Prover
Provers should be registered on L1 before their TEE proofs can be accepted. Once the prover is deployed to Intel SGX, it can produce a Remote Attestation Report. The report can be verified to ensure that: (1) the prover is running the correct code inside Intel SGX, and (2) a key pair (pk, sk) was generated by the correct code.
It’s tricky to verify the report in the EVM, because the report is large (around 20 KB) and the verification involves JSON parsing and cryptography that is unsupported or costly on-chain (secp256r1, sha256, etc.). To solve that problem, we propose building the report verifier in ZK.
Our reference implementation in Risc0:
https://github.com/jasl/zk_dcap_verifier_poc
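The two registration checks can be sketched as below. This is an illustration only: `AttestationReport`, its fields, `EXPECTED_MEASUREMENT`, and the convention that the report's user-data field carries a SHA-256 hash of pk are all assumptions, and the hardware signature chain is assumed to have been validated elsewhere (for example by the ZK report verifier).

```python
import hashlib
from dataclasses import dataclass

# Hypothetical measurement (code hash) of the approved prover build.
EXPECTED_MEASUREMENT = hashlib.sha256(b"approved-prover-build").digest()


@dataclass(frozen=True)
class AttestationReport:
    measurement: bytes  # hash of the code running inside the enclave
    report_data: bytes  # user data the enclave bound into the report
    verified: bool      # True once the hardware signature chain was checked


class ProverRegistry:
    """Toy stand-in for the L1 registry of TEE prover public keys."""

    def __init__(self):
        self.registered = set()

    def register(self, report: AttestationReport, pk: bytes) -> bool:
        if not report.verified:
            return False  # not a genuine TEE report
        if report.measurement != EXPECTED_MEASUREMENT:
            return False  # check (1): wrong code is running
        if report.report_data != hashlib.sha256(pk).digest():
            return False  # check (2): pk was not produced by that code
        self.registered.add(pk)
        return True


registry = ProverRegistry()
pk = b"\x04" + b"\x11" * 64  # placeholder public key bytes
report = AttestationReport(EXPECTED_MEASUREMENT, hashlib.sha256(pk).digest(), True)
print(registry.register(report, pk))
```

Binding pk into the report is what ties the on-chain key to the attested code: a key generated outside the enclave cannot produce a matching report.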
High Availability
As suggested, the multi-proof system should allow many (e.g. 100+) TEE provers to be registered. Assuming no vulnerability in TEE provers, we just need 1-of-N approval to accept the TEE block proof.
If a TEE vulnerability is found, the system automatically falls back to the zk-rollup security. In that case, we suggest stopping the TEE verifier through on-chain governance.
We need enough TEE provers to ensure high availability, so the L2 should give some basic incentive to the TEE provers. The provers can have multiple implementations and run on decentralized TEE networks (e.g. Phala Network, which currently has ~35k SGX workers available).
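The 1-of-N rule and the governance fallback can be expressed as a small acceptance policy. The sketch below assumes the 2FA mode where every block needs a valid ZK proof; `TeeVerifier`, `block_accepted`, and the pluggable `check_signature` callback are hypothetical names, and the toy signature check in the usage lines is a placeholder for a real ECDSA verification.

```python
class TeeVerifier:
    """Toy model of the on-chain TEE verifier with a governance switch."""

    def __init__(self, registered_pks, check_signature):
        self.registered_pks = set(registered_pks)
        self.check_signature = check_signature  # (pk, commitment, sig) -> bool
        self.enabled = True

    def disable(self):
        """Called by governance if a TEE vulnerability is discovered."""
        self.enabled = False

    def tee_approved(self, commitment, signatures):
        """1-of-N: one valid signature from any registered prover suffices."""
        return any(
            pk in self.registered_pks and self.check_signature(pk, commitment, sig)
            for pk, sig in signatures
        )


def block_accepted(zk_proof_valid, verifier, commitment, signatures):
    if not zk_proof_valid:
        return False  # the TEE proof is only ever an addition to the ZK proof
    if not verifier.enabled:
        return True   # fallback: plain zk-rollup security
    return verifier.tee_approved(commitment, signatures)


# Placeholder signature scheme, for illustration only.
toy_check = lambda pk, commitment, sig: sig == b"sig:" + pk + commitment
verifier = TeeVerifier([b"pk1", b"pk2"], toy_check)
print(block_accepted(True, verifier, b"c", [(b"pk2", b"sig:pk2c")]))
```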
Prover Implementation
Since the prover doesn’t need to generate a real ZK proof, we can run the Polygon zkEVM prover in mock mode inside the TEE. It syncs the blockchain from L1 (or DA), rebuilds the L2 database, and reproduces the state root for each L2 block. We can then produce block proofs simply by signing every newly calculated state root with sk. The Flashbots team has shown that running an Ethereum node in Intel SGX is possible.
It’s also possible to move the L2 database out of SGX to minimize the memory footprint of the prover. We can read the state database from a public RPC node on demand when executing an L2 block. The state key-value pairs come with Merkle proofs against the old state root, so all state reads can be verified.
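To illustrate how an untrusted state read can be checked inside the enclave, here is a minimal sketch using a plain binary Merkle tree over SHA-256. This is a simplification chosen for brevity: the real state tree of the L2 uses a different structure and hash, and all function names here are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Length-prefix the key so (key, value) pairs cannot be confused.
    return h(b"\x00" + len(key).to_bytes(4, "big") + key + value)


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(leaves):
    """Build all tree levels; an odd last node is paired with itself."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """RPC node side: collect the sibling hashes from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[index])
        index //= 2
    return proof


def verify_read(state_root, key, value, index, proof) -> bool:
    """Enclave side: recompute the root from the claimed (key, value)."""
    node = leaf_hash(key, value)
    for sibling in proof:
        node = node_hash(node, sibling) if index % 2 == 0 else node_hash(sibling, node)
        index //= 2
    return node == state_root


state = [(b"balance:alice", b"100"), (b"balance:bob", b"42"), (b"nonce:alice", b"7")]
levels = build_levels([leaf_hash(k, v) for k, v in state])
old_state_root = levels[-1][0]
proof = prove(levels, 2)
print(verify_read(old_state_root, b"nonce:alice", b"7", 2, proof))
```

The enclave only needs to hold the old state root; any value the RPC node returns is rejected unless its proof recomputes to that root.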
Other benefits
- Fast Finality
- The ZK prover and the TEE prover can submit proofs in parallel. The TEE proof usually arrives first because it has less overhead.
- Users can decide the criteria to consider a block “finalized” based on their needs
- Non-critical use cases: faster TEE finality
- Critical use cases: more secure TEE + ZK multi-proof finality
- Flexible ZKP + TEE combination
- For security: 100% of blocks have a ZK + TEE proof
- Trade-off for cost: only x% of blocks get a ZKP (e.g. Taiko zk-rollups have 99% TEE-only and 1% ZK + TEE proofs)
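As a rough illustration of these two knobs, the sketch below shows a user-side finality criterion and one possible way to pick which blocks get a ZK proof. Both functions are hypothetical, and the modulo-based selection rule is only an assumption made to keep the example deterministic; the proposal does not specify how the x% would be chosen.

```python
def is_finalized(tee_proven: bool, zk_proven: bool, critical: bool) -> bool:
    """User-side criterion: critical use cases wait for both proofs."""
    if critical:
        return tee_proven and zk_proven
    return tee_proven


def needs_zk_proof(block_number: int, zk_percent: int) -> bool:
    """Hypothetical rule: roughly zk_percent out of every 100 blocks get a ZKP."""
    return block_number % 100 < zk_percent


# A TEE-only proven block is final for a non-critical use case only.
print(is_finalized(True, False, critical=False))
print(is_finalized(True, False, critical=True))
```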