Light Clients: Independent Verification

The network needs to support trustless clients that verify consensus output independently. These “execution nodes” or “light clients” are used by operators who want to track the canonical chain but do not want to store historic data.

All execution results are stored in memory and garbage collected over time.

One use case is creating attestations for bridging.

Clarifying scope

There is an important distinction to make between “light clients” and “execution nodes”.

Light Clients

Light clients rely on execution results from other nodes. The goal for light clients is to verify execution results without independently executing the blocks themselves. These clients are still secure and are much faster/lighter than “execution nodes”.

Light clients receive and validate consensus output using validator public keys and asymmetric cryptography. If the light client considers the output valid, it requests the latest header from a full validator node that matches the “parent_beacon_block_root” and “nonce”. These header values correspond to the TN consensus output digest and the consensus sequence number for the round.

Using a fetched block from a known validator (verifiable on-chain) allows the light client to quickly and efficiently obtain the latest state of Telcoin Network without having to execute every transaction itself.
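
As a rough illustration of the header check described above, here is a minimal Python sketch. The `ConsensusOutput` and `Header` types and the `header_matches_output` helper are hypothetical names made up for this example, not TN types. It assumes the consensus output has already passed signature verification, and that “parent_beacon_block_root” maps to the output digest while “nonce” maps to the sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsensusOutput:
    """Hypothetical view of consensus output that already passed signature checks."""
    digest: bytes   # consensus output digest for the round
    sequence: int   # consensus sequence number for the round


@dataclass(frozen=True)
class Header:
    """Hypothetical subset of the header fields a light client would compare."""
    parent_beacon_block_root: bytes
    nonce: int
    state_root: bytes


def header_matches_output(header: Header, output: ConsensusOutput) -> bool:
    """Accept a fetched header only if both fields match the validated output."""
    return (
        header.parent_beacon_block_root == output.digest
        and header.nonce == output.sequence
    )


output = ConsensusOutput(digest=b"\x11" * 32, sequence=42)
good = Header(parent_beacon_block_root=b"\x11" * 32, nonce=42, state_root=b"\xaa" * 32)
stale = Header(parent_beacon_block_root=b"\x11" * 32, nonce=41, state_root=b"\xbb" * 32)

print(header_matches_output(good, output))   # matching digest and sequence
print(header_matches_output(stale, output))  # sequence number is one round behind
```

A header that fails this check would simply be discarded and requested again from another validator.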

Execution Nodes

Execution nodes are stateless clients that independently execute output from consensus. The execution results are not persisted and are only stored in memory. The execution node frees memory over time to avoid exhausting resources. These nodes must download all worker blocks that reach quorum and verify that state transitions are valid according to consensus.

By downloading worker blocks as they reach quorum, execution nodes likely have all the blocks they need to execute consensus output. However, the nodes must also have access to download any blocks they’re missing in order to fully execute the output. The missing blocks should be available from nodes that specialize in data availability to limit the number of requests from peers outside the network of nodes that reach consensus.
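
To make the in-memory storage and the missing-block lookup more concrete, here is a small Python sketch. The `ExecutionCache` class, the `max_rounds` limit, and the `missing_blocks` helper are all hypothetical. The eviction policy (drop the oldest round first) is an assumption, since the garbage collection strategy is not specified above.

```python
from collections import OrderedDict
from typing import Optional


class ExecutionCache:
    """Keeps only the most recent execution results in memory."""

    def __init__(self, max_rounds: int) -> None:
        self.max_rounds = max_rounds
        self._results: "OrderedDict[int, bytes]" = OrderedDict()

    def store(self, round_number: int, result: bytes) -> None:
        self._results[round_number] = result
        self._results.move_to_end(round_number)
        # Garbage collect: evict the oldest rounds once the limit is exceeded.
        while len(self._results) > self.max_rounds:
            self._results.popitem(last=False)

    def get(self, round_number: int) -> Optional[bytes]:
        return self._results.get(round_number)


def missing_blocks(required: list, downloaded: set) -> list:
    """Digests named by consensus output that still have to be fetched."""
    return [digest for digest in required if digest not in downloaded]


cache = ExecutionCache(max_rounds=2)
for rnd in (1, 2, 3):
    cache.store(rnd, f"result-{rnd}".encode())

print(cache.get(1))  # round 1 was evicted when round 3 arrived
print(missing_blocks([b"a", b"b", b"c"], {b"a", b"c"}))
```

Anything returned by `missing_blocks` would be requested from a data availability node before the output can be fully executed.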

Ethereum plans to support “stateless clients” eventually, but Verkle trees are a prerequisite for their implementation. Telcoin Network may support these later, but chooses to prioritize a separate node type for data availability.

Validators waiting to participate in consensus must receive output from consensus, download all worker blocks that reach quorum, independently execute all transactions, and store all data. These “non-voting validator” (NVV) nodes act as an important buffer for data availability and reduce the burden on “committee voting validators” (CVVs), which are responsible for reaching consensus and extending the canonical chain. NVVs support the network as archives for the epochs in which they are not voting. They also support epoch transitions by ensuring validators are online and ready to participate. The network’s integrity could benefit from NVVs attesting to CVV output, but that is outside the scope of this improvement proposal.

Careful consideration is needed to balance the burden on CVVs of reliably broadcasting sealed artifacts to peers within the voting committee against broadcasting sealed artifacts to NVVs for wider network propagation. This is a known scalability issue, but it is disregarded for now in order to prioritize protocol features over optimization.

Security hypothetical: Fake Bridge Transaction

Scenario

A malicious actor creates a fake committee with BLS keys that they control. The validators’ workers are deployed and seemingly running the protocol successfully to produce blocks. However, the network was constructed with an invalid genesis state that benefits the attacker.

In this scenario, the attacker is trying to trick the bridge into migrating funds.

The attacker has funded execution layer accounts (secp256k1) with a generous amount of TEL and they want to bridge off Telcoin Network to Ethereum.

Light client experience

A new validator joins the Axelar network to attest that bridging transactions are legitimate from Telcoin Network to Ethereum. They use a discovery mechanism that has been hijacked by the attackers. They see the malicious committee as valid.

The malicious committee transitions state to show a transaction bridging TEL to Ethereum.

Outcome

The transactions would not succeed because the locked TEL on Ethereum would not match the request. Axelar would not be able to unlock the amount of funds allocated by the attacker to their account.

Bridging as safe as consensus (BASAC)

Telcoin Network is fundamentally tied to the bridging process. Without a secure, successful, efficient bridging solution, Telcoin Network can’t survive. Bridging is so important, one could argue it’s as important as consensus itself.

Built-in light client support

Because bridging is arguably as important as consensus, validators should fully support efficient ways to verify state changes. Ideally, anyone can use light clients but the scope of this proposal should prioritize bridging attestation clients.

Validators already produce full blocks for full execution clients, but they should also create “light blocks” with all the information light clients need to verify the next block.

Signed messages

Validator public keys are available on-chain, so any light client with a known block can use that to verify committees and signatures. Once a light client has its “genesis” root (the well-known block it uses as a root to verify future blocks - could also be TN genesis block, but doesn’t have to match), it collects gossiped light blocks to track state changes. All nodes must gossip light blocks to support data availability for bridging services.

Light clients simply collect light blocks from a quorum of validators for the round of consensus.

Technical flow

  • Light client starts with well-known genesis block that the node operator is responsible for manually verifying. For this flow, assume the light client’s well-known block is also the genesis block from epoch 0.
  • Light client reaches out to RPC for handshake (limited to only NVVs?)
    • Validator adds light client to peer list. This list is used to gossip light blocks for new rounds of consensus.
    • Validator may or may not validate light client credentials. If for some reason Merkle proofs bog down the network, it may be necessary to prioritize connections for light clients involved with bridging. For now, focus on the happy path: the validator simply acknowledges that they are now peers and replies with its version of the public committee information, which the light client can use to discover other validators.
    • Light clients must reach out to validators and introduce themselves?
  • CVV commits new round of consensus and creates consensus output and signs a light block.
    • Gossip light block to NVVs
    • NVVs verify and gossip light block
    • Light clients verify and gossip light block
  • Client receives light block through gossip network or directly from peer validator
    • Client verifies the integrity of the message
    • Client reaches out to start downloading block data?
      • At this point, would it be better for protocol to support the data bridging needs through a special RPC endpoint?
        • CVV uses worker to sign bridge requests for account info, any relevant logs?
        • How much of this should be baked into protocol vs EVM agnostic solution?
        • I think lean towards EVM agnostic for now. It places more execution burden on the light client, but reduces the lift for protocol devs in the short term. Execution code already exists, but protocol devs would need to develop special RPC endpoints to support custom bridging info requests.
        • If only creating a “bridging client”, then the validators could create a “bridge block” instead of a “light block” that contains specific account info and logs needed for bridging.
  • Client tracks messages for latest round. Once enough messages (2f+1) are collected from CVVs for the round, the client considers the round validated.
    • For example, in a 4 node network for “Round X”:
      • Light client needs at least 3 signed messages for round x with matching data
    • The light client starts downloading data for the round from a peer once it receives the first signed message
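
The quorum tracking step at the end of the flow can be sketched roughly as follows. This is only an illustration: `RoundTracker` and `quorum_threshold` are hypothetical names, validators are identified by plain strings, and signature verification is assumed to have happened before `add_message` is called.

```python
from collections import defaultdict


def quorum_threshold(committee_size: int) -> int:
    """2f + 1, where f = (n - 1) // 3 is the number of tolerated faulty nodes."""
    f = (committee_size - 1) // 3
    return 2 * f + 1


class RoundTracker:
    """Counts distinct CVV signers per (round, data digest) pair."""

    def __init__(self, committee: set) -> None:
        self.committee = committee
        self.threshold = quorum_threshold(len(committee))
        # (round number, data digest) -> validators that signed that data
        self._signers = defaultdict(set)

    def add_message(self, round_number: int, digest: bytes, validator: str) -> bool:
        """Record a verified message; return True once the round is validated."""
        if validator not in self.committee:
            return False  # ignore messages from outside the committee
        signers = self._signers[(round_number, digest)]
        signers.add(validator)  # a set, so duplicates from one validator count once
        return len(signers) >= self.threshold


tracker = RoundTracker({"v1", "v2", "v3", "v4"})
print(tracker.threshold)                          # 3 for a 4 node network
print(tracker.add_message(7, b"data", "v1"))      # 1 of 3
print(tracker.add_message(7, b"data", "v2"))      # 2 of 3
print(tracker.add_message(7, b"other", "v3"))     # mismatched data does not count
print(tracker.add_message(7, b"data", "v4"))      # 3 of 3, round validated
```

Keying on the digest as well as the round means messages only count toward quorum when their data matches, which is the “matching data” requirement in the 4 node example.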

Outstanding considerations

  • Is eth_getProof rpc call sufficient for gossiped blocks without additional light block type?
  • If committed to light blocks, should light block messages be signed by the primary or is worker sufficient?
    • If worker signs, then this could be used for follow-up exchanges
      • Light block only contains roots for merkle proof, but bridging light clients are looking for on-chain events and account balances.
      • Light client could verify any request with worker signature for special rpc endpoint, BUT this creates the need for light client verification to prevent DoS attacks that keep workers constantly signing data
  • TN might include a new rpc method for finding block data by round (aka block nonce)
    • Consensus round could produce multiple execution blocks
  • TN as a “friendly network” where peer connections for workers are included in worker blocks so bridging clients have more support
    • Protocol could propagate bridging peers to better support discoverability
      • Beneficial for supporting bridge-specific clients, but less ideal for trustless light client approach
  • Verifying a single block requires all data from the round
    • To re-execute a block, a client would need only the parent execution block, plus all of the data from consensus output for the round
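
For reference on the first consideration, an eth_getProof request takes an account address, a list of storage keys, and a block identifier. The sketch below only builds the JSON-RPC payload and does not contact any node. The helper name, the placeholder address, and the storage key are made up for the example.

```python
import json


def build_get_proof_request(address: str, storage_keys: list,
                            block: str = "latest", request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request for eth_getProof."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getProof",
        # Parameter order: account address, storage keys, block number or tag.
        "params": [address, storage_keys, block],
    }
    return json.dumps(payload)


# Placeholder values only: a zero address and storage slot 0.
placeholder_address = "0x" + "00" * 20
print(build_get_proof_request(placeholder_address, ["0x0"]))
```

The response carries account and storage proofs against a state root, so on its own it would not cover the on-chain events that bridging clients also need.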

Agreed on aiming to keep our bridging infra as EVM agnostic as possible, since that is how Axelar’s infra is designed to be used. Introducing new RPC endpoints and baking more complexity into the protocol for bridging purposes increases attack surface area and can introduce more tech debt than we’d want to take on, especially considering the likelihood of Axelar pushing updates to their protocol, which could include breaking changes that require us to reorganize TN at the protocol layer.

The boilerplate voting-verifier setup which we will be replacing with the light client uses the following components:

Ampd is the daemon that attaches to the Axelar network and listens for events (such as those reported by a relay tx). It also connects to Telcoin Network via RPC, where it can verify events on the TN side.

Tofnd is a dependency for Ampd, providing signing capabilities for transactions and batches.

Ampd checks the TN rpc endpoint to verify TN event emission when it is informed of a new bridging request (kickstarted by the start of a “signing session”, i.e. a vote). If the bridge request is valid as confirmed by the RPC, then Ampd uses Tofnd to “vote yea” by signing a transaction and submitting it to the voting-verifier. This is the functionality that we will need to connect to the light client’s RPC.

I am still developing my understanding of these things, but it really seems like the light client approach is very similar to the voting-verifier, with the main difference being that the voting-verifier collects signatures from multiple Ampd+Tofnd instances by weight and uses them to construct a multisig signature on the multisig-prover when the weighted threshold is reached.
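
Collecting votes “by weight” could look something like the Python sketch below. The function name, the integer weights, and the threshold value are all hypothetical. This is not Axelar’s implementation, just the general shape of a weighted threshold check.

```python
def weighted_quorum_reached(votes: dict, weights: dict, threshold: int) -> bool:
    """Sum the weight of every verifier that voted yea and compare to the threshold.

    votes maps verifier id -> True (yea) or False (nay).
    weights maps verifier id -> voting weight; unknown verifiers count as 0.
    """
    yea_weight = sum(weights.get(verifier, 0) for verifier, yea in votes.items() if yea)
    return yea_weight >= threshold


weights = {"a": 5, "b": 3, "c": 2}

# a and c vote yea: 5 + 2 = 7, which meets a threshold of 7.
print(weighted_quorum_reached({"a": True, "b": False, "c": True}, weights, threshold=7))

# only b and c vote yea: 3 + 2 = 5, which falls short.
print(weighted_quorum_reached({"b": True, "c": True}, weights, threshold=7))
```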

Grant and I just spoke to Ben and Stephen @ Axelar to clarify some of the bridging architecture and its relevance to TN.

For security, simplicity, and forward-compatibility with potential Axelar changes, we will use all of their architecture’s default components, listed below:

TN-specific deployments:

  • external gateway (on TN)
  • internal gateway (on Axelar)
  • voting verifier
  • multisig prover

Axelar general components:

  • router (routes verified messages between internal gateways)
  • governance (used to natively integrate TN with Axelar and handles upgrades of our external gateway)

Major takeaways that we will need to tackle:

  • implement relayer for outbound (from TN) bridge txs. this relayer must quasi-reconstruct consensus by collecting 2f+1 signatures from validators that a bridge event happened
  • implement verifier. this verifier(s) must verify the above signatures and vote until quorum is reached for the bridge message to be forwarded to the router
  • decide on number of verifier instances & their rewards schemas (for verifying messages from the relayer). These verifier instances will also serve to sign incoming messages from other chains via the multisig prover (and be compensated accordingly)

Other smaller miscellaneous takeaways:

  • users pay their own gas, fronting enough gas for all axelar internal costs and source chain costs
  • only support ethereum ↔ TN to start for simplicity & security

Here is a helpful screenshot for clarifying the mental model of the bridging process:

Diagram Breakdown:

  1. User locks $TEL in external gateway on TN, emitting ContractCall() event. These events are special occurrences for TN validators, who sign relevant data to commit that they executed this bridging state change and provide it to the quasi-consensus relayer
  2. Once the quasi-consensus relayer has received 2f+1 validator-signed commits of the bridging event, it initiates a tx on Axelar chain calling verify_messages() on Axelar’s internal gateway (which is TN-specific)
  3. Axelar internal TNGateway then kicks off verification by calling the TN voting-verifier who in turn commences a poll
  4. verifier instances monitor the voting-verifier and notice the start of a new poll, leading them to initiate vote txs to Axelar chain. They should only vote yes if they find 2f+1 valid signatures originated by validators for the bridge message
  5. poll ends
  6. voting-verifier responds to the internal gateway on whether quorum of yea votes for the bridge message was reached
  7. Once a message has been validated, a tx can be initiated on internal_gateway::route_messages() to forward the bridge message to the router
  8. router performs additional check to ensure the forwarded message is recorded as verified by the internal gateway (seems extraneous, I am probably missing something here)
  9. router looks at the destination chain member of the verified bridge message to identify which destination internal gateway to route to (ethereum in this case) and instructs the destination internal gateway to store the routed message
  10. external entity initiates an Axelar transaction calling the multisig-prover for the destination chain destination_multisig_prover::construct_proof() which first fetches the routed bridge message from the destination_internal_gateway
  11. Multisig prover initiates a signing session, returning a session id
  12. multisig_prover::construct_proof() pings the destination chain’s multisig contract to alert the destination chain’s verifiers that a signing session for session id has begun
  13. destination chain’s verifiers notice session id has started and submit voting transactions on the validity of the bridge message
  14. signing session completes, emitting an event
  15. multisig contract informs the multisig prover whether session id reached quorum of verifier votes. If successful, the bridge message is finalized and stored as a proof
  16. relayer for the destination chain (ethereum) can now query destination_multisig_prover::getProof() and use it to execute the bridged msg at the destination chain
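
As a way to keep the sixteen steps straight, the happy path can be collapsed into a handful of stages. The stage names below are made up for this sketch and are not Axelar terminology. Failure paths, such as a poll that does not reach quorum, are deliberately left out.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical happy-path stages of an outbound (TN to Ethereum) bridge message."""
    EMITTED = 1    # step 1: ContractCall() emitted by the external gateway on TN
    POLLING = 2    # steps 2-5: verify_messages() called and the poll runs
    VERIFIED = 3   # step 6: voting-verifier reports quorum of yea votes
    ROUTED = 4     # steps 7-9: message stored at the destination internal gateway
    SIGNING = 5    # steps 10-14: construct_proof() called and signing session runs
    PROVEN = 6     # step 15: message finalized and stored as a proof
    EXECUTED = 7   # step 16: relayer executes the message on the destination chain


def advance(stage: Stage) -> Stage:
    """Move a message to the next happy-path stage."""
    if stage is Stage.EXECUTED:
        raise ValueError("message already executed")
    return Stage(stage.value + 1)


stage = Stage.EMITTED
while stage is not Stage.EXECUTED:
    stage = advance(stage)
    print(stage.name)
```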