Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prune blobs #3852

Merged
merged 64 commits into from
Feb 8, 2023
Merged

Prune blobs #3852

merged 64 commits into from
Feb 8, 2023

Conversation

emhane
Copy link
Contributor

@emhane emhane commented Jan 5, 2023

Issue Addressed

#3625 (Secondary: blob pruning)

Proposed Changes

Prune blobs after MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS elapsed

Additional Info

@emhane emhane marked this pull request as draft January 5, 2023 17:34
@emhane emhane added the deneb label Jan 5, 2023
@emhane emhane added the ready-for-review The code is ready for review label Jan 8, 2023
@emhane emhane removed the ready-for-review The code is ready for review label Jan 9, 2023
@emhane emhane added the ready-for-review The code is ready for review label Jan 10, 2023
@emhane
Copy link
Contributor Author

emhane commented Jan 10, 2023

I'm not sure what we want to use this field for, a copy of the head of the chain or the block root mapping to the last stored blobs sidecar? For the latter, since we don't store empty blobs sidecars, it doesn't serve as data to calculate if more than MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS have passed since the oldest_blob_parent to the head of the chain. https://github.com/emhane/lighthouse/blob/5470c083ffe33ff6c64208743acab38f56f44435/beacon_node/store/src/metadata.rs#L133-L134

@emhane emhane changed the title WIP: Prune blobs Prune blobs Jan 10, 2023
@emhane emhane marked this pull request as ready for review January 10, 2023 09:01
beacon_node/beacon_chain/src/beacon_chain.rs Outdated Show resolved Hide resolved
database_manager/src/lib.rs Outdated Show resolved Hide resolved
beacon_node/store/src/errors.rs Outdated Show resolved Hide resolved
beacon_node/store/src/config.rs Show resolved Hide resolved
beacon_node/beacon_chain/src/builder.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
beacon_node/beacon_chain/src/beacon_chain.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
@realbigsean
Copy link
Member

Wow, this one's turning out to be pretty tricky! Left some comments and would also like to loop in @michaelsproul to see what he thinks

I'm not sure what we want to use this field for

Feel free to delete that one or restructe that struct however you need, I made it thinking we would have a more decoupled block + blob sync design, so really was just best-guessing on fields it would require.

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sorry for the very long review, this melted my brain a little because I was torn between reviewing the conceptual design and the code.

I think for blobs we may need to use a pruning algorithm that's quite different from the ones we use for blocks, states and payloads.

The fundamental reason for this is that we need to prune with or without finality (see discussion here ethereum/consensus-specs#3141).

The high-level algorithm I'm thinking is something like this:

  • Keep try_prune_blobs method, but call it regularly from after_new_head on a background thread. Make it cheap to bail out if pruning doesn't need to occur yet.
  • Define whether pruning needs to occur in terms of the last_pruned_epoch, the current epoch, and a new configuration parameter epochs_per_blob_prune which optionally prevents pruning every single epoch. The condition for pruning becomes:
let new_pruned_epoch = current_epoch - *MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS;
let need_to_prune = last_pruned_epoch + epochs_per_blob_prune <= new_pruned_epoch;
  • When need_to_prune == true, prune the blobs using a forwards block iterator that uses the freezer block roots, and backtracks from the head state if it needs to. Ideally to keep this cheap we have an Arc clone of the head state on hand which we pass in a lazy closure to forwards_block_roots_iterator_until. The pruning proceeds roughly as it does now, moving from last_pruned_epoch to new_pruned_epoch and deleting as it goes.
  • After pruning finishes we updated last_pruned_epoch to new_pruned_epoch and store that on disk.
  • In addition, unconditionally delete blobs for any blocks that conflict with finalization here:

.flat_map(|block_root: Hash256| {
[
StoreOp::DeleteBlock(block_root),
StoreOp::DeleteExecutionPayload(block_root),
]
})

This assumes that blob availability guarantees do not need to apply to orphaned blocks that are permanently discarded due to finalization. However maybe this isn't a safe assumption in case of a maliciously finalized chain as Dankrad describes here: ethereum/consensus-specs#3141 (comment). If we wanted to handle that case we'd need a way of collecting blobs from all chains in the pruning window. We could do this using the head tracker as we do for block and state pruning (see this), although that will run slower.

Another complication that my pruning algorithm doesn't address is that pruned blobs should be considered available even after they've been pruned: ethereum/consensus-specs#3169. One way to handle that would be to store a smaller bit of data in a separate column in the database mapping BlockRoot => (). Basically a set of block roots for all pruned blocks. We could probably get away with just storing this for orphaned blocks that aren't finalized, as we can already lookup whether a block root is part of the finalized canonical chain (which would imply availability?)

Ok that was a lot! Hopefully this gives us some directions for discussion and progress 🤞

beacon_node/beacon_chain/src/beacon_chain.rs Outdated Show resolved Hide resolved
beacon_node/beacon_chain/src/builder.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
@emhane emhane added ready-for-review The code is ready for review and removed ready-for-review The code is ready for review labels Jan 12, 2023
@emhane emhane force-pushed the prune_blobs branch 3 times, most recently from ae97dbd to f428c41 Compare January 15, 2023 09:14
Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates look good!

I need to have a more thorough review when I've got more time, but this looks spot on.

beacon_node/store/src/hot_cold_store.rs Outdated Show resolved Hide resolved
@emhane emhane added ready-for-review The code is ready for review and removed ready-for-review The code is ready for review labels Jan 18, 2023
Copy link
Member

@realbigsean realbigsean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great ! Awesome job @emhane

I've told people to use start using this branch on devnet 4 anyways so we could rebase on latest capella https://github.com/realbigsean/lighthouse/tree/devnet-4-compat

@realbigsean realbigsean merged commit 4156719 into sigp:eip4844 Feb 8, 2023
@zhiqiangxu
Copy link
Contributor

Is there a plan to support pruning blobs automatically instead of manually?

@michaelsproul
Copy link
Member

@zhiqiangxu It is already automatic

@zhiqiangxu
Copy link
Contributor

@michaelsproul Oh I missed the process_prune_blobs which does pruning automatically in the background, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deneb ready-for-review The code is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants