Add Lighthouse beacon node peer #47

Closed
wants to merge 17 commits into from


jimmygchen
Contributor

@jimmygchen jimmygchen commented Nov 5, 2022

This PR adds a Lighthouse beacon node peer.

This is still WIP as the Lighthouse team is currently working on stabilizing the eip4844 branch. At the moment I am still getting compilation errors when building the branch locally, so I will keep following changes on it.

I have not added generating genesis and running bootnodes with Lighthouse - these will probably come later if required.

@jimmygchen jimmygchen changed the base branch from master to devnet-v3 November 5, 2022 05:16
@jimmygchen jimmygchen changed the title Add lighthouse Add Lighthouse beacon node peer Nov 5, 2022
@jessepollak

🔥🔥

Excited to land this!

@jimmygchen
Contributor Author

jimmygchen commented Nov 6, 2022

I was hoping to get this to a state where the Lighthouse node is able to sync until the eip4844 fork epoch, given that there are still a few outstanding tasks before Lighthouse is ready for interop on devnet-v3. However, I'm seeing an issue with processing pre-4844 blocks when running Lighthouse alongside a Prysm beacon node. Not sure whether this is a config issue or something else.

The Lighthouse beacon node is able to sync with the Prysm beacon node initially. However, it rejects the blocks received from Prysm, failing with BlockSignatureVerifierError(AttestationValidationError(BeaconStateError(InvalidBitfield))).

Initial Sync:

lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.127 DEBG Execution client config is OK           service: exec
lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.127 DEBG Received Status Response                fork_digest: [12, 224, 171, 2], finalized_epoch: 0, finalized_root: 0x0000…0000, head_slot: 0, head_root: 0x40df…3d10, peer_id: 16Uiu2HAmE5HQCj6QwPZ6HCHepqKdQZ21GhMT3wgB57aogZKEojSz, service: router
lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.128 DEBG Execution engine online                 service: exec
lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.128 DEBG Execution engine upcheck complete       state: Synced, service: exec
lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.129 DEBG Peer transitioned sync state            is_connected: true, their_finalized_epoch: 0, their_head_slot: 0, our_finalized_epoch: 0, our_head_slot: 0, new_state: Synced, peer_id: 16Uiu2HAmE5HQCj6QwPZ6HCHepqKdQZ21GhMT3wgB57aogZKEojSz, service: sync
lighthouse-beacon-node-follower-1  | Nov 06 18:38:54.129 INFO Sync state updated                      new_state: Synced, old_state: Stalled, service: sync

Received first block, but block signature verification failed due to invalid bitfield:

lighthouse-beacon-node-follower-1  | Nov 06 18:38:56.068 DEBG Successfully verified gossip block      root: 0xa0f604af6ee5b0f07728b77fd9166a697a878f23423e34f25eb9854000c95c90, slot: 1, graffiti: , service: beacon
lighthouse-beacon-node-follower-1  | Nov 06 18:38:56.070 INFO New block received                      root: 0xa0f604af6ee5b0f07728b77fd9166a697a878f23423e34f25eb9854000c95c90, slot: 1
lighthouse-beacon-node-follower-1  | Nov 06 18:38:56.073 CRIT Beacon block processing error           error: BlockSignatureVerifierError(AttestationValidationError(BeaconStateError(InvalidBitfield))), service: beacon
lighthouse-beacon-node-follower-1  | Nov 06 18:38:56.074 DEBG Invalid gossip beacon block             block slot: 1, block root: 0xa0f604af6ee5b0f07728b77fd9166a697a878f23423e34f25eb9854000c95c90, outcome: Err(BeaconChainError(BlockSignatureVerifierError(AttestationValidationError(BeaconStateError(InvalidBitfield)))))
lighthouse-beacon-node-follower-1  | Nov 06 18:38:56.075 DEBG Peer score adjusted                     score: -5.00, peer_id: 16Uiu2HAmE5HQCj6QwPZ6HCHepqKdQZ21GhMT3wgB57aogZKEojSz, msg: bad_gossip_block_ssz, service: libp2p

Not sure if I'm looking at the right thing here, but looking at the Prysm beacon node, the attestation in slot 1 contains only 1 aggregation bit (0x03, minus the start marker) and there is only 1 validator in the slot 0 committee, so I'm not sure why Lighthouse is seeing InvalidBitfield 🤔
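
For reference, here is a minimal sketch of how the aggregation bits value 0x03 decodes under the SSZ Bitlist rules, where the highest set bit of the last byte is the length delimiter (the "start marker" mentioned above). The function name `decode_ssz_bitlist` is made up for illustration; this is not code from Lighthouse or Prysm.

```python
from typing import List


def decode_ssz_bitlist(data: bytes) -> List[bool]:
    """Decode an SSZ Bitlist into a list of booleans.

    Bits are stored little-endian within each byte. The highest set bit of
    the final byte is a delimiter that marks the length and is not payload.
    """
    if not data or data[-1] == 0:
        raise ValueError("bitlist must end with a non-zero byte holding the delimiter bit")
    # The delimiter bit's index equals the number of payload bits.
    length = (len(data) - 1) * 8 + data[-1].bit_length() - 1
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(length)]


bits = decode_ssz_bitlist(bytes([0x03]))
print(len(bits), bits)  # 1 [True]
```

Decoded this way, 0x03 is a one-element bitlist with its single bit set, which is consistent with a committee of one validator.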

Any help is appreciated!

image

@Inphi
Owner

Inphi commented Nov 7, 2022

Might have something to do with the way Prysm generates mock validators for interop testing. We could try getting rid of Prysm's interop test flags to see if that resolves the issue.

@Inphi
Owner

Inphi commented Nov 7, 2022

I'm unable to build the lighthouse image locally. Any ideas what's causing this?

➜  eip4844-interop git:(lighthouse) ✗ dc build lighthouse-beacon-node-follower
Building lighthouse-beacon-node-follower
Sending build context to Docker daemon  46.21MB
Step 1/10 : FROM rust:1.65.0-bullseye AS builder
 ---> e9654080c167
Step 2/10 : RUN apt-get update && apt-get -y upgrade && apt-get install -y cmake libclang-dev protobuf-compiler
 ---> Using cache
 ---> 1f7e3ecfe8f3
Step 3/10 : COPY lighthouse lighthouse
 ---> Using cache
 ---> b2c3d0ddadd3
Step 4/10 : ARG FEATURES
 ---> Using cache
 ---> 72925d300bb8
Step 5/10 : ENV FEATURES $FEATURES
 ---> Using cache
 ---> 35fdc219a149
Step 6/10 : RUN cd lighthouse && make && make install-lcli
 ---> Using cache
 ---> c3eb7d279dcd
Step 7/10 : FROM ubuntu:22.04
 ---> a8780b506fa4
Step 8/10 : RUN apt-get update && apt-get -y upgrade && apt-get install -y --no-install-recommends   libssl-dev   ca-certificates   curl   iproute2   jq   && apt-get clean   && rm -rf /var/lib/apt/lists/*
 ---> Running in 051d95c3d393
Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy InRelease [270 kB]
Get:3 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [579 kB]
Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [114 kB]
Get:5 http://security.ubuntu.com/ubuntu jammy-security/multiverse amd64 Packages [4644 B]
Get:6 http://security.ubuntu.com/ubuntu jammy-security/universe amd64 Packages [757 kB]
Get:7 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [99.8 kB]
Get:8 http://archive.ubuntu.com/ubuntu jammy/multiverse amd64 Packages [266 kB]
Get:9 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [480 kB]
Get:10 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages [1792 kB]
Get:11 http://archive.ubuntu.com/ubuntu jammy/universe amd64 Packages [17.5 MB]
Get:12 http://archive.ubuntu.com/ubuntu jammy/restricted amd64 Packages [164 kB]
Get:13 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [940 kB]
Get:14 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages [528 kB]
Get:15 http://archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages [16.9 kB]
Get:16 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [881 kB]
Get:17 http://archive.ubuntu.com/ubuntu jammy-backports/main amd64 Packages [3175 B]
Get:18 http://archive.ubuntu.com/ubuntu jammy-backports/universe amd64 Packages [7290 B]
Fetched 24.5 MB in 3s (7151 kB/s)
Reading package lists...
E: Problem executing scripts APT::Update::Post-Invoke 'rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true'
E: Sub-process returned an error code
The command '/bin/sh -c apt-get update && apt-get -y upgrade && apt-get install -y --no-install-recommends   libssl-dev   ca-certificates   curl   iproute2   jq   && apt-get clean   && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
ERROR: Service 'lighthouse-beacon-node-follower' failed to build : Build failed

@jimmygchen
Contributor Author

@Inphi hmm I'm not seeing the same error, looks like it hasn't even started executing make lighthouse
What version of Docker Engine are you on? I'm on 20.10.20. Could be related to this issue:
readthedocs/readthedocs.org#9171

@Inphi
Owner

Inphi commented Nov 8, 2022

> @Inphi hmm I'm not seeing the same error, looks like it hasn't even started executing make lighthouse What version of Docker Engine are you on? I'm on 20.10.20. Could be related to this issue: readthedocs/readthedocs.org#9171

Thanks, that fixed the issue.

Noticed another issue where Lighthouse's fork digest doesn't match Prysm's:
image

For interop, we override the default fork digest (because prysm needs unique fork versions when a custom chain config is used). This is specified in the shared chain-config so I'm wondering if lighthouse could be ignoring the configs.
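
As background, a minimal sketch of how a fork digest is derived in the consensus specs: the first 4 bytes of the hash tree root of `ForkData(current_version, genesis_validators_root)`. For this two-field container the root reduces to a single SHA-256 over two 32-byte chunks. The fork versions below are the ones Prysm reports loading in this thread; the all-zero genesis validators root is a placeholder assumption, not the devnet's real value.

```python
import hashlib


def compute_fork_digest(fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Return the 4-byte fork digest for a fork version and genesis validators root."""
    if len(fork_version) != 4 or len(genesis_validators_root) != 32:
        raise ValueError("expected a 4-byte fork version and a 32-byte root")
    # hash_tree_root(ForkData): the 4-byte version is right-padded to a
    # 32-byte chunk, then hashed together with the 32-byte root.
    version_chunk = fork_version.ljust(32, b"\x00")
    return hashlib.sha256(version_chunk + genesis_validators_root).digest()[:4]


PLACEHOLDER_ROOT = bytes(32)  # assumption: stands in for the real genesis validators root

genesis_digest = compute_fork_digest(bytes([0, 0, 15, 254]), PLACEHOLDER_ROOT)
altair_digest = compute_fork_digest(bytes([1, 0, 15, 254]), PLACEHOLDER_ROOT)
print(genesis_digest.hex(), altair_digest.hex())
```

Because the digest depends on both the active fork version and the genesis validators root, two nodes report different digests if they disagree on either value, or on which fork is currently active.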

@jimmygchen
Contributor Author

jimmygchen commented Nov 9, 2022

@Inphi hmm isn't it quite common to override the fork schedule on custom networks? I don't see why Lighthouse would be ignoring those :/ It looks like the error usually happens at slot 3, before transitioning to epoch 1. I wonder whether the fork digest issue is related to the previous issue with processing blocks: if the Lighthouse node isn't upgrading to the next fork, that would result in a different fork digest.

There are still a few outstanding issues before Lighthouse is ready for interop on devnet v3. I'm inclined to try this again once we get the EF consensus spec tests to pass on the Lighthouse eip4844 branch, to iron out potential issues, as the branch isn't stable yet.

@jimmygchen
Contributor Author

I'm just guessing though - but if I increase the fork epochs, the error doesn't show up immediately at slot 3.

Fork related config values loaded in Prysm:

GenesisForkVersion:[0 0 15 254]
AltairForkVersion:[1 0 15 254]
AltairForkEpoch:1
BellatrixForkVersion:[2 0 15 254]
BellatrixForkEpoch:2
ShardingForkVersion:[8 0 0 0]
ShardingForkEpoch:18446744073709551615
Eip4844ForkVersion:[3 0 15 254]
Eip4844ForkEpoch:3
CapellaForkVersion:[3 0 0 0]
CapellaForkEpoch:18446744073709551615
ForkVersionSchedule:map[[0 0 0 0]:0 [1 0 0 0]:74240 [2 0 0 0]:144896 [4 0 0 0]:18446744073709551615]
ForkVersionNames:map[[0 0 0 0]:phase0 [1 0 0 0]:altair [2 0 0 0]:bellatrix [4 0 0 0]:eip4844]
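
To illustrate why slot 3 matters with these fork epochs, here is a small sketch that maps a slot to the active fork for a given SLOTS_PER_EPOCH. It compares 3 slots per epoch (the value in the shared chain config, per the later comments in this thread) with the mainnet preset's 32. The helper name `active_fork` is hypothetical and the logic is a simplification of the spec's fork schedule.

```python
from typing import Dict

# Fork epochs as loaded by Prysm in the log above.
FORK_EPOCHS: Dict[str, int] = {"altair": 1, "bellatrix": 2, "eip4844": 3}


def active_fork(slot: int, slots_per_epoch: int,
                fork_epochs: Dict[str, int] = FORK_EPOCHS) -> str:
    """Return the name of the fork that is active at the given slot."""
    epoch = slot // slots_per_epoch
    active = "phase0"
    # Walk the forks in activation order and keep the latest one reached.
    for name, fork_epoch in sorted(fork_epochs.items(), key=lambda item: item[1]):
        if epoch >= fork_epoch:
            active = name
    return active


for slot in (2, 3, 6, 9):
    print(slot, active_fork(slot, 3), active_fork(slot, 32))
```

With 3 slots per epoch, slot 3 is already in epoch 1 (Altair), while a node using 32 slots per epoch still treats it as epoch 0 (phase0). Under that assumption the two nodes would disagree on both the block format and the fork digest from slot 3 onward.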

@Inphi
Owner

Inphi commented Nov 10, 2022

Seems there's a config issue:

➜  eip4844-interop git:(lighthouse) ✗ curl -s 'localhost:5052/eth/v1/config/spec'  | jq .data.SLOTS_PER_EPOCH
"32"
➜  eip4844-interop git:(lighthouse) ✗ curl -s 'localhost:3500/eth/v1/config/spec'  | jq .data.SLOTS_PER_EPOCH
"3"

The shared chain config sets the SLOTS_PER_EPOCH to 3
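
A rough sketch for comparing every config value the two nodes report, rather than checking one key at a time. It uses the same `/eth/v1/config/spec` endpoint as the curl commands above. Treating port 5052 as Lighthouse and 3500 as Prysm is an assumption based on those clients' default ports, and the helper names are made up for illustration.

```python
import json
from typing import Dict, Optional, Tuple
from urllib.request import urlopen


def fetch_spec(base_url: str) -> Dict[str, str]:
    """Fetch the 'data' object from a beacon node's /eth/v1/config/spec."""
    with urlopen(f"{base_url}/eth/v1/config/spec", timeout=5) as resp:
        return json.load(resp)["data"]


def diff_specs(a: Dict[str, str],
               b: Dict[str, str]) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the keys whose values differ, with a missing key shown as None."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


def compare_nodes(url_a: str, url_b: str) -> None:
    """Print every config key on which the two nodes disagree."""
    for key, (left, right) in diff_specs(fetch_spec(url_a), fetch_spec(url_b)).items():
        print(f"{key}: {url_a}={left} {url_b}={right}")


# Offline demonstration using the two values reported above.
print(diff_specs({"SLOTS_PER_EPOCH": "32"}, {"SLOTS_PER_EPOCH": "3"}))
```

Against the running devnet, this would be called as `compare_nodes("http://localhost:5052", "http://localhost:3500")`.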

@Inphi
Owner

Inphi commented Nov 10, 2022

Seems related to this. I'll update the configs to use the minimal preset. (Note that there's an eip4844 preset defined for pyspec testing, but we don't wanna use it because its FIELD_ELEMENTS_PER_BLOB conflicts with what we use in Execution.)

@jimmygchen
Contributor Author

> Seems related to this. I'll update the configs to use the minimal preset. (Note that there's a eip4844 preset defined used for pyspec testing. But we don't wanna use this because the FIELD_ELEMENTS_PER_BLOB conflicts with what we use in Execution).

Ah, nice find! I'll update the Dockerfile to build a minimal preset Lighthouse image.

@jimmygchen
Contributor Author

Looks like PRESET_BASE was recently removed in this commit, so I guess it's now defaulting to mainnet for most clients. However, Lighthouse requires the PRESET_BASE to be defined in the config file, and doesn't run without it.

@Inphi what was the reason to have a lower SLOTS_PER_EPOCH? The current config doesn't work with Lighthouse, so I think we'll have to either use the minimal preset or stick with the SLOTS_PER_EPOCH defined in the mainnet preset. What do you think?

@Inphi
Owner

Inphi commented Nov 14, 2022

The slots per epoch is reduced to make testing quicker and more responsive during local development. It isn't essential for running the e2e tests.

I tried using the minimal preset but ran into a couple of issues. Is it possible to define a new preset for Lighthouse?

@jimmygchen
Contributor Author

jimmygchen commented Nov 15, 2022

What issue were you seeing, is it this one?

eip4844-interop-prysm-beacon-node-1  | time="2022-11-15 02:55:12" level=fatal msg="Could not save interop genesis state" error="could not save genesis state: --.Slashings (bytes array does not have the correct length): expected 8192 and 64 found" prefix=deterministic-genesis

Looks like it has something to do with the SSZ proto templating variables defined here. It might require a custom Prysm build to run the minimal preset. I've read mentions of building with --//proto:network=minimal but haven't been able to build successfully.

I think it's possible to define a new preset for Lighthouse with code changes, but I'm guessing that would require either adding the preset to other clients too or using a custom config file for Lighthouse. The latter is probably easier.

@roberto-bayardo roberto-bayardo deleted the branch Inphi:devnet-v3 November 17, 2022 17:50
@Inphi
Owner

Inphi commented Nov 21, 2022

Apologies for not responding on time. I've been sick for the past week and I'm now coming up to speed. Did the syncing issue get resolved?

@jimmygchen
Contributor Author

Hey @Inphi no worries, hope you're feeling better now.
I haven't made much progress on this. Last time I got stuck building and running Prysm on the minimal preset. I'm guessing it requires a custom build with the minimal network parameter, but I wasn't able to get it working with

bazel build //beacon-chain:beacon-chain --//proto:network=minimal

I've asked the question on Prysm discord:
https://discord.com/channels/476244492043812875/781210739762397234/1044391783557234759

@jimmygchen
Contributor Author

> Seems related to this. I'll update the configs to use the minimal preset. (Note that there's a eip4844 preset defined used for pyspec testing. But we don't wanna use this because the FIELD_ELEMENTS_PER_BLOB conflicts with what we use in Execution).

@Inphi hmm I see that we're using FIELD_ELEMENTS_PER_BLOB=4096 in execution. If this is the value we are going to use for interop testing, we may need a custom preset in Lighthouse, given that preset values like SLOTS_PER_EPOCH and FIELD_ELEMENTS_PER_BLOB cannot be overridden at runtime.

@jimmygchen
Contributor Author

I'll look into creating a new preset in Lighthouse for this.

@Inphi
Owner

Inphi commented Nov 24, 2022

Would it be easier to use the mainnet preset? It seems even Lodestar will need another custom preset, and we should try to keep it super easy for other clients.

@jimmygchen
Contributor Author

> would it be easier to use the mainnet preset? it seems even lodestar will need another custom preset and we should try to keep it super easy for other clients.

Yes, that would be much easier!

@jimmygchen
Contributor Author

@realbigsean is continuing work on this branch, I'm going to do some testing with this as well!
https://github.com/realbigsean/eip4844-interop/tree/interop-lighthouse

@jimmygchen jimmygchen mentioned this pull request Dec 9, 2022
4 participants