Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attestation and Sync Committee aggregation selection #224

Merged
merged 10 commits into from
Jan 31, 2023

Conversation

OisinKyne
Copy link
Contributor

@OisinKyne OisinKyne commented Jul 18, 2022

Problem this PR intends to address

Distributed Validator Technology is the idea that a number of independent validator clients can operate together as one or more logical 32 ether validators using BLS signature aggregation. The proof of stake spec has been designed with the intention of being MPC friendly.

There is however an issue with the assignment of aggregation duties. These duties are determined by the hash of a slot_signature. In a distributed validator, each individual validator has a different BLS private key, so they all individually produce different slot_signatures that hash differently, so all attempt to aggregate at different incorrect occasions.

What does this PR do?

This PR adds two new endpoints:

/eth/v1/validator/beacon_committee_selections
/eth/v1/validator/sync_committee_selections

Screenshot 2022-11-28 at 19 02 25

These endpoints are optional for a consensus client, and are designed to be opt-in for validator clients (when something like a --distributed mode is enabled). These endpoints are implemented by the distributed validator middleware that intercepts this API.

When a validator is configured to run as part of a distributed validator cluster, at the start of an epoch (slot for sync committee) the validator client will determine its attestation duties then calculate a slot signature for that slot to determine whether it must also aggregate. In the distributed flow, it will send this slot_signature to the beacon client, and take the signature returned as a response as the one to use to calculate is_aggregator(). If this test returns true, and the validator must produce an aggregate attestation (or sync contribution), the combined signature (not the partial) must also be included in the AggregateAndProof object.

This extra step can be visualised in the aggregation flow like so:

Screenshot 2022-11-28 at 17 27 21

Things to note

  • This proposal has been changed from a /v2/ of another endpoint to its own /v1/ endpoint (along with a second for sync aggregations).
  • This endpoint need not be implemented on the CL side. Once a validator client is aware of it and able to use it when a feature flag is turned on, the intercepting middleware can handle and swallow the request. I suggest a CL either returns:
    • 501 Not Implemented
    • 400 Bad Request. To indicate to the user that they either have --distributed turned on for a non DV use case, or they have a distributed validator private key wired directly to a consensus client without any middleware in between.
  • Not implementing this would mean distributed validators do not aggregate as required (though there is no direct punishment for not aggregating currently), and the overall network effectiveness would be harmed.
  • The pros and cons and previous designs of this change can be read in more detail here.

rkapka
rkapka previously requested changes Aug 16, 2022
apis/validator/beacon_committee_subscriptions.v2.yaml Outdated Show resolved Hide resolved
items:
type: object
properties:
is_aggregator:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe add validator_index here so it is explicit which validator is the aggregator? The implicit mapping using request array indexes feels brittle.

Copy link
Contributor

@xenowits xenowits Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. So, the response would instead look like:

[
  { "validator_index": 7, "is_aggregator": false},
  { "validator_index": 1, "is_aggregator": false},
  { "validator_index": 13, "is_aggregator": true}
]

Comment on lines 3 to 14
summary: Signal beacon node to prepare for a committee subnet
description: |
After beacon node receives this request:
- Search using discv5 for peers related to this subnet and replace current peers with those ones if necessary.
- Check each provided `slot_signature`, to determine whether any validator has been [assigned the duty of aggregation](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md#aggregation-selection)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The summary and description probably need to be updated to reflect the new flow. VCs do not "signal" the BN anymore, but rather "VC provides data to maybe signal and BN responds with wether it was signalled".

@rolfyone
Copy link
Collaborator

I'm curious why the validator isn't passing is_aggregator, but rather the signature? If the validator passes is_aggregator for the indices that are aggregating, then the middleware could read the post body...

Is there another need for the signature to be passed? If you have is_aggregator in the request, then the request body would have all the details that the middleware might need...

Not sure if this is covered elsewhere, but this is basically a sticky session kind of deal - if the request goes to 1 BN, the aggregate request would need to go to that same BN, otherwise likely wouldn't get a usable aggregate... or send the request to all (or subset and track) BN...
Posting the signed aggregate could land anywhere, it's just the request for the aggregate which would be sticky

@ajsutton
Copy link
Contributor

Is there another need for the signature to be passed? If you have is_aggregator in the request, then the request body would have all the details that the middleware might need...

The actual validator key is an aggregate of multiple different keys, each running in its own validator client. The middleware needs to capture the actual signature from each vc and aggregate them to get the signature for the validator that's actually on chain, then use that signature to determine if aggregation should be performed or not.

@OisinKyne
Copy link
Contributor Author

OisinKyne commented Oct 16, 2022

Hi @rolfyone :)

As mentioned by Adrian, we can't have a VC send a boolean of whether it should aggregate because right now each validator is hashing the wrong signature to correctly determine whether it should aggregate. For us to correctly figure out if a validator should aggregate, we need to combine a threshold of these signatures together to get the real slot_signature for the validator, which then gets hashed and modulo'd here to dictate whether or not you are a selected to aggregate.

In practice, this endpoint is designed to be intercepted, and might not need to ever be implemented on the beacon node side if the client teams don't want to. A middleware client intercepting these V2 requests and calculating the aggregation selection could subsequently call the CL's /V1/ endpoint with true/false. It would be the validator side of things that would need to be aware of this endpoint, and be able to use it when a feature flag is enabled.

Posting the signed aggregate could land anywhere, it's just the request for the aggregate which would be sticky

Correct it is important that all CLs get notified to prepare for an aggregate once it has been determined. The distributed validator clients come to consensus on what aggregate to present to the connected VCs for signing, so they should be tolerant to a subset of CLs not returning a meaningful aggregation to sign.

@mcdee
Copy link
Contributor

mcdee commented Oct 16, 2022

A middleware client intercepting these V2 requests and calculating the aggregation selection could subsequently call the CL's /V1/ endpoint ...

To date the V2 of an endpoint causes the deprecation of the V1 endpoint. If these endpoints are meant to co-exist (even if one is only used by the beacon node and the other by the middleware) then this should be a separate endpoint, not V2 of the existing endpoint.

@OisinKyne
Copy link
Contributor Author

OisinKyne commented Oct 17, 2022

To date the V2 of an endpoint causes the deprecation of the V1 endpoint. If these endpoints are meant to co-exist (even if one is only used by the beacon node and the other by the middleware) then this should be a separate endpoint, not V2 of the existing endpoint.

Hi @mcdee good to hear from you. We chatted as a team today and agree. We think this endpoint as proposed is arguably doing two jobs (telling the VC if it should aggregate, and telling the upstream BN to announce and listen on the topic, and begin preparing an aggregate), we think maybe we can make things more simple.

We are thinking we might re-work this PR, and change it to an independent v1 endpoint used only for returning an aggregate slot signature, to allow the VC to determine if they are an aggregator. (we will have a second almost identical endpoint for the sync committee aggregate). Then the flow would be:

  • If validator is in "distributed mode";
  • send slot_signatures to /eth/v1/validator/beacon_committee_selections
  • wait for a returned signature
  • check if selected to aggregate (sha(signature) mod 8 == 0)
  • if yes or no, VC calls /eth/v1/validator/beacon_committee_subscriptions as normal and uses the boolean to indicate aggregation
  • if yes, asks BN for an aggregate attestation at 2/3rds of the slot

So basically if the VC enables this mode they are taking the hit of making a blocking network call to know whether it should aggregate, but this is needed to allow DVs to aggregate such that we don't negatively impact attestation density network wide if DVT sees adoption.

P.S. We just added tentative support for vouch :)

@OisinKyne OisinKyne changed the title V2 prepareBeaconCommitteeSubnet WIP: V2 prepareBeaconCommitteeSubnet Oct 17, 2022
@OisinKyne OisinKyne changed the title WIP: V2 prepareBeaconCommitteeSubnet Attestation and Sync Committee aggregation selection Nov 28, 2022
@OisinKyne
Copy link
Contributor Author

Hey @michaelsproul, was there a reason to dismiss the previous approval on this PR? :) It showed as stale as I keep rebasing this PR as more commits go onto main. @rkapka 's request changes is from a much older version of this PR, so that one can probably be also marked resolved if he doesn't get a chance to look at it, had been planning to give him a week before chasing, though opting to tag now while I'm posting this.

Cheers.

@michaelsproul
Copy link
Collaborator

Hey @OisinKyne not sure what happened there sorry, I didn't mean to do that. I think sometimes reviews get marked stale when a conflict happens, I'm not sure why my name was associated with it. Maybe because I merged this unrelated PR? #285.

to correctly determine if one of its validators has been selected to perform an aggregation duty in this slot.
Consensus clients need not support this endpoint and may return a 501.
tags:
- Validator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should have a new tag, either Distributed Validator or something, to allow people to easily see what distributed validators are implementing...

@rolfyone
Copy link
Collaborator

This looks pretty good, have a stab at adding to the top level CHANGES.md too :)

@OisinKyne OisinKyne dismissed a stale review via aca8b38 January 30, 2023 13:51
@rkapka rkapka dismissed their stale review January 30, 2023 22:20

my mistake

Copy link
Collaborator

@rolfyone rolfyone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as i can tell this was approved. marking so.

@rolfyone rolfyone merged commit 3945797 into ethereum:master Jan 31, 2023
@arnetheduck
Copy link
Contributor

Is there any implementation of this PR? Any test cases?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.