Enable light client data backfill by tracking best SyncAggregate
#3614
base: dev
Beacon nodes can only compute light client data locally if they have the corresponding `BeaconState` available. This is not the case for blocks before the initially synced checkpoint state. The p2p-interface defines endpoints to sync light client data, but it only supports forward sync.

To enable beacon nodes to backfill light client data, we must ensure that a malicious peer cannot convince us of fraudulent data. While it is possible to verify light client data against the locally backfilled blocks, blocks are not necessarily available anymore on libp2p as they are subject to `MIN_EPOCHS_FOR_BLOCK_REQUESTS`. Light client data stays relevant for more than 5 months, and without validating it against local block data it is impossible to distinguish canonical light client data from fraudulent light client data that eventually culminates in a shared history; the old periods in that case could still be manipulated. Furthermore, agreeing on canonical data improves caching performance and is relevant, e.g., for the portal network.

To support efficient proof that a `LightClientUpdate` is canonical, it is proposed to minimally extend the `BeaconState` to track the best `SyncAggregate` of the current and previous sync committee period, according to an implementation-independent ranking function. The proposed ranking function is compatible with what consensus nodes implementing ethereum#3553 are already making available across libp2p and REST transports. It is based on and compatible with the `is_better_update` function in `specs/altair/light-client/sync-protocol.md`.

There are three minor differences to `is_better_update`:

1. `is_better_update` runs in the LC, so runs without fork choice. It needs extra conditions to prefer older data over newer data. The `BeaconState` ranking function can use simpler logic.
2. The LC is always initialized from a post-Altair finalized checkpoint. This assumption does not hold in theoretical edge cases, requiring an extra guard for `ALTAIR_FORK_EPOCH` in the `BeaconState` function.
3. `is_better_update` has to deal with BNs serving incomplete data while they are still backfilling. This is not the case with `BeaconState`.

Once the data is available in the `BeaconState`, a light client data backfill protocol could be defined that serves, for past periods:

1. A `LightClientUpdate` from requested `period` + 1 that proves that the entirety of `period` is finalized.
2. `BeaconState.historical_summaries[period].block_summary_root` at (1)'s `attested_header.beacon.state_root` + Merkle proof.
3. For each epoch's slot 0 block within requested `period`, the corresponding `LightClientHeader` + Merkle multi-proof for the block's inclusion into (2)'s `block_summary_root`.
4. For each of the entries from (3) with `beacon.slot` within `period`, the `current_sync_committee_branch` + Merkle proof for constructing `LightClientBootstrap`.
5. If (4) is not empty, the requested `period`'s `current_sync_committee`.
6. The best `LightClientUpdate` from `period`, if one exists, + Merkle proof that its `sync_aggregate` + `signature_slot` is selected as the canonical best one in (1)'s `attested_header.beacon.state_root`.

Only the proof in (6) depends on `BeaconState` tracking the best light client data. This modification would enshrine the logic of a subset of `is_better_update`, but does not require adding any `LightClientXyz` data structures to the `BeaconState`.
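To make the ranking idea concrete, here is a minimal sketch of what a `BeaconState`-side comparison could look like. It is not the function proposed in this PR: the `SyncDataSketch` type, its field names, and `is_better_sync_data` are hypothetical, and the ordering of criteria is modeled on the general shape of `is_better_update` (supermajority, then finality, then sync committee finality, then participation). Keeping the incumbent on a tie is an assumption standing in for the "prefer older data" conditions the LC needs, since a state sees candidates in slot order.

```python
from dataclasses import dataclass

SYNC_COMMITTEE_SIZE = 512  # mainnet preset value


@dataclass(frozen=True)
class SyncDataSketch:
    """Hypothetical summary of one candidate; not a spec container."""
    num_active_participants: int
    has_finality: bool
    has_sync_committee_finality: bool


def has_supermajority(data: SyncDataSketch) -> bool:
    # At least 2/3 of the sync committee participated.
    return data.num_active_participants * 3 >= SYNC_COMMITTEE_SIZE * 2


def is_better_sync_data(new: SyncDataSketch, old: SyncDataSketch) -> bool:
    """Return True if `new` should replace `old` as the tracked best."""
    # 1. Supermajority beats no supermajority.
    new_super, old_super = has_supermajority(new), has_supermajority(old)
    if new_super != old_super:
        return new_super
    # Below supermajority, raw participation decides.
    if not new_super and new.num_active_participants != old.num_active_participants:
        return new.num_active_participants > old.num_active_participants
    # 2. Data that indicates finality is preferred.
    if new.has_finality != old.has_finality:
        return new.has_finality
    # 3. Among finalized data, prefer sync committee finality.
    if new.has_finality and (
        new.has_sync_committee_finality != old.has_sync_committee_finality
    ):
        return new.has_sync_committee_finality
    # 4. Higher participation wins.
    if new.num_active_participants != old.num_active_participants:
        return new.num_active_participants > old.num_active_participants
    # 5. Tie: keep the incumbent (assumption: candidates arrive in slot
    #    order, so the incumbent is the older one).
    return False
```

Because the comparison only uses values derivable from the block and state, any two implementations following the same rules would select the same best entry, which is the property the Merkle proof in step (6) relies on.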
Context: https://hackmd.io/@etan-status/electra-lc

This is the next step needed to decentralize

A deeply finalized checkpoint root could be integrated into the network's

Also related: ethpandaops/checkpointz#143
# Sync history
previous_best_sync_data=default_sync_data(), # [New in Electra]
current_best_sync_data=default_sync_data(), # [New in Electra]
parent_block_has_sync_committee_finality=(pre.slot == GENESIS_SLOT), # [New in Electra]
@hwwhww the idea here is that, if a genesis state is produced, it should be `True`. If the fork is applied later, it should be `False`. Not sure how to best express that.
There are only two situations in which the sync committees can already be finalized at the time the fork is activated:
- Genesis at >= Electra
- Electra not scheduled at a historical roots boundary, which is not expected to happen on the relevant networks. If that were the case, there would be a tiny de-rank for the blocks building on the initial Electra block, but nothing impactful.
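The two cases above can be illustrated with a small sketch. The helper names are hypothetical and the constants are the mainnet preset values. The first function mirrors the `pre.slot == GENESIS_SLOT` expression from the diff; the second only checks whether a fork epoch falls on a historical roots boundary and is not part of the proposed change.

```python
GENESIS_SLOT = 0
SLOTS_PER_EPOCH = 32              # mainnet preset
SLOTS_PER_HISTORICAL_ROOT = 8192  # mainnet preset (256 epochs)


def initial_parent_finality_flag(pre_slot: int) -> bool:
    # True only when the upgrade is applied while producing a genesis
    # state; False when the fork activates on an already running chain.
    return pre_slot == GENESIS_SLOT


def fork_epoch_on_historical_roots_boundary(fork_epoch: int) -> bool:
    # Hypothetical helper: does the fork's first slot start a new
    # historical roots period?
    return (fork_epoch * SLOTS_PER_EPOCH) % SLOTS_PER_HISTORICAL_ROOT == 0
```

On mainnet presets a historical roots period spans 8192 slots, the same length as a sync committee period (256 epochs of 32 slots), which is why the boundary matters for sync committee finality at fork activation.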
This should be placed as an EIP in