Add daily scheduled rewards #131

Merged: 67 commits merged into main on Jun 12, 2024
Conversation

@juliangruber (Member) commented Jun 6, 2024

  • Expose scheduled rewards
  • Test API
  • Find all participant addresses
  • Observe scheduled rewards

Blocked by #102
For filecoin-station/desktop#1552

@juliangruber changed the title from "Add participant scheduled rewards" to "Add daily scheduled rewards" on Jun 6, 2024
@juliangruber (Member, Author):

@bajtos do you have an idea how to best test this? I would suggest passing a mocked IE contract and then querying whether scheduled rewards have been recorded in the DB.
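
A minimal sketch of such a test, assuming a mocha-style runner, a test database with migrations applied, and a hypothetical observeScheduledRewards export (the names and import path are placeholders, not an existing API):

import assert from 'node:assert'
import pg from 'pg'
import { observeScheduledRewards } from '../lib/observer.js' // hypothetical export

it('records scheduled rewards for participants', async () => {
  // mock only the contract method the observer calls
  const ieContractMock = {
    rewardsScheduledFor: async (address) => 100n
  }
  const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL })
  await observeScheduledRewards(pgPool, ieContractMock)
  const { rows } = await pgPool.query(
    'SELECT address, scheduled_rewards FROM daily_scheduled_rewards'
  )
  assert.strictEqual(rows[0].scheduled_rewards, '100')
  await pgPool.end()
})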

@juliangruber (Member, Author) commented Jun 7, 2024

Another question is how to get the participant addresses.

Query the contract

When we query the contract, we only get the participants with scheduled rewards over 0.1 FIL. That is insufficient for plotting a participant's progress towards the threshold.

Add to spark-api/evaluate

Two potential sources for participants are spark-api, where we see the addresses of all participants, and spark-evaluate, where we see the addresses of all participants with accepted measurements. I would prefer not to add the logic needed for this PR to these repos, because to me they are unrelated concerns. I would like to add code only to spark-stats; if we do add code to spark-api, it should be generic enough to let arbitrary services consume this data. No changes to spark-{api,evaluate} should be necessary if we later decide to change the semantics of "participants".

But how should these services expose the participants in a way that is not opinionated? (E.g. after which period of inactivity should a participant be removed from the list? Should the list be queryable by date? Etc.)

Query setScores() calls

Another option is querying block explorers for setScores() calls, which also include the participant addresses with accepted measurements. This should work similarly to listening for transfer events (#102). However, it makes us dependent on third-party APIs (like Beryx) to expose this data; unfortunately, the contract itself can't expose it.

Lazy approach

Yet another option is the lazy approach: when a participant's scheduled rewards are requested for the first time, we add that participant to the list. This has the upside of the least architectural cost, but the downside that you won't see historic rewards from before your first request (read: before you first opened Station Desktop). This is likely unacceptable.

Consume raw measurements

Another contract-based option is to listen for MeasurementsAdded events, then download the raw measurements from web3.storage, and extract the participant addresses. This should be simple to implement, but it's quite clumsy.
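
A rough sketch of that pipeline (the event signature, the measurement format, and the contract constants here are assumptions; check the deployed ABI):

import { ethers } from 'ethers'

const provider = new ethers.JsonRpcProvider('https://api.node.glif.io/rpc/v1')
// IE_CONTRACT_ADDRESS and IE_CONTRACT_ABI are placeholders for the deployed contract
const ieContract = new ethers.Contract(IE_CONTRACT_ADDRESS, IE_CONTRACT_ABI, provider)

ieContract.on('MeasurementsAdded', async (cid) => {
  // download the raw measurements from a public web3.storage gateway
  const res = await fetch(`https://${cid}.ipfs.w3s.link`)
  const measurements = await res.json()
  // each measurement is assumed to carry the participant's address
  const addresses = new Set(measurements.map(m => m.participantAddress))
  // ...persist the addresses for the scheduled-rewards observer to use
})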

Add contract event

Another option is to add an event to the IE smart contract itself, which will post evaluation results to the chain. This comes at the development cost of deploying a new contract.


There are probably other options here too. @bajtos @patrickwoodhead what do you think?

@juliangruber (Member, Author) commented Jun 7, 2024

I think spark-stats should read this information from the smart contract. The contract is the piece other services should interact with as much as possible, because it lives on the chain and can therefore be trusted.

That means, spark-stats will either:

  • listen for MeasurementsAdded events, download measurements via IPFS, and extract the participant addresses
  • query Beryx for setScores() calls, and extract the participant addresses
  • listen for a new ScoresSet event, and extract the participant addresses, after we deployed a new IE contract with this feature

My thoughts on these options:

  • listen for MeasurementsAdded events, download measurements via IPFS, and extract the participant addresses

Downloading measurements from web3.storage often fails. At the moment only spark-evaluate does this; we would now have a second service that can fail because of it. It's a known problem, but we'd also be investing more into it.

It will double our web3.storage egress traffic. I think that shouldn't be an issue, but it's wasteful.

This will give spark-stats the addresses of all participants, regardless of whether their measurements were accepted. I think that shouldn't be an issue.

This option is quite easy to implement and shouldn't take more than 1 day.

  • query Beryx for setScores() calls, and extract the participant addresses

This couples us to Beryx; if they decide to change their API, we have to hope we can find another one that works.

I'm currently evaluating whether Beryx can expose this information. I've reached out to Beryx to ask whether this could be supported; currently, the API fails when trying to fetch this data.

  • listen for a new ScoresSet event, and extract the participant addresses, after we deployed a new IE contract with this feature

Adding a new event isn't bad, but migrating to a new contract is painful and takes time.

Review comment on observer/lib/observer.js, lines 68 to 81:
try {
scheduledRewards = await ieContract.rewardsScheduledFor(address)
} catch (err) {
console.error('Error querying scheduled rewards for', address, { cause: err })
continue
}
console.log('Scheduled rewards for', address, scheduledRewards)
await pgPool.query(`
INSERT INTO daily_scheduled_rewards (day, address, scheduled_rewards)
VALUES (now(), $1, $2)
ON CONFLICT (day, address) DO UPDATE SET
scheduled_rewards = EXCLUDED.scheduled_rewards
`, [address, scheduledRewards])
}
A Member commented:

This is going to be very expensive. With 5k daily active participants (see Spark Public Dashboard), we are going to make 5k RPC API calls followed by 5k SQL queries.

(1)
Can we group RPC API calls and SQL queries into batches? I believe Ethers v6 supports request batching, and we are configuring the RPC API provider to use batching. Now, we need to find out how to trigger the batching behaviour.
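
For reference, ethers v6 JsonRpcProvider coalesces concurrent JSON-RPC calls into a single HTTP request by default; these are the documented options for tuning that behaviour:

import { ethers } from 'ethers'

// batching is on by default; these options tune it
const provider = new ethers.JsonRpcProvider('https://api.node.glif.io/rpc/v1', undefined, {
  batchMaxCount: 10,    // max calls per batch (default: 100)
  batchStallTime: 10,   // ms to wait for more calls before sending (default: 10)
  batchMaxSize: 1 << 20 // max batch payload in bytes (default: 1 MB)
})
// calls issued concurrently, e.g. via Promise.all, end up in one batched request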

What I would like to see at high level:

const BATCH_SIZE = 10 // for example, we need to tweak this
for (const addrBatch of splitIntoBatches(dailyParticipantAddresses, BATCH_SIZE)) {
  // is this going to trigger Ethers.js batching behaviour?
  const rewards = await Promise.all(
    addrBatch.map(addr => ieContract.rewardsScheduledFor(addr))
  )

  // run a single SQL query to update multiple rows
  await pgPool.query(`
    INSERT INTO daily_scheduled_rewards (day, address, scheduled_rewards)
    VALUES (now(), unnest($1::text[]), unnest($2::numeric[]))
    ON CONFLICT (day, address) DO UPDATE SET
    scheduled_rewards = EXCLUDED.scheduled_rewards
  `, [
    addrBatch,
    rewards
  ])
}

(The query is inspired by an existing query in spark-evaluate here).
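
splitIntoBatches is not an existing utility; a minimal version could look like this:

// split an array into chunks of at most batchSize items
const splitIntoBatches = (items, batchSize) => {
  const batches = []
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize))
  }
  return batches
}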

(2)
We must be mindful of how many requests we send to the Glif RPC API. IMO, we shouldn't send all 5k queries as fast as the systems can handle, as that would put too much load on the RPC API provider.

I propose to introduce a small delay between iterations of this loop.

If we batch requests for every 10 participants, we need to send ~500 requests. If we use a 1s delay, then we will finish updating all participants in 500 seconds ≈ 8.3 minutes. I think that's fast enough, since the scheduled rewards are updated every 20 minutes.
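
A sketch of that delay, using a plain promise-based helper and the 1s pause suggested above:

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// at the end of each iteration of the batch loop:
await delay(1000)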

@bajtos (Member) commented Jun 9, 2024

@bajtos do you have an idea how to best test this? I would suggest passing a mocked IE contract and then querying whether scheduled rewards have been recorded in the DB.

Not really. A mocked IE contract sounds like a good start to me. We already used that approach in voyager-publisher tests IIRC (see filecoin-station/voyager-api#22)

Another question is how to get the participant addresses.

Great description of different options available 👌🏻

I posted my thoughts in #131 (comment) before I read your comments.

I think spark-stats should read this information from the smart contract. The contract is the piece other services should interact with as much as possible, because it lives on the chain and can therefore be trusted.

💯

Instead of adding a setScores event, I propose adding an event emitted whenever a participant's scheduled rewards change. There are two cases when that happens: setScores and increaseParticipantBalance increase the balance, while transferScheduled decreases it.
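
If such an event existed, say ScheduledRewardsUpdated(address participant, uint256 amount) (the name and fields are hypothetical until the contract change is designed), spark-stats could consume it like this:

// ieContract is an ethers.Contract instance, pgPool a pg.Pool, as elsewhere in this PR
ieContract.on('ScheduledRewardsUpdated', async (participant, amount) => {
  await pgPool.query(`
    INSERT INTO daily_scheduled_rewards (day, address, scheduled_rewards)
    VALUES (now(), $1, $2)
    ON CONFLICT (day, address) DO UPDATE SET
      scheduled_rewards = EXCLUDED.scheduled_rewards
  `, [participant, amount])
})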

Add to spark-api/evaluate

But how should these services expose the participants in a way that is not opinionated? (E.g. after which period of inactivity should a participant be removed from the list? Should the list be queryable by date? Etc.)

As I explained in #131 (comment), spark-evaluate maintains the table daily_participants. This table powers our public dashboard, so I think it's safe to assume it will always interpret the term "participant" correctly.

As for opinions about inactivity and querying by date: I guess the current solution is somewhat opinionated to serve the needs of the dashboard, but I also think it's flexible enough to support additional use cases, e.g. your work in this pull request.

--

Depending on how much time you are willing to spend on this feature, I propose to choose one of the following two ways forward:

  • Least effort: Use the existing table daily_participants. No changes outside of this PR should be needed. (A query sketch follows after this list.)

  • Ideal solution: Implement the new smart contract event. It's more work, but the solution may perform better as we can listen to events instead of repeatedly checking the balance of every participant.
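
A sketch of the least-effort query (the join assumes spark-evaluate's schema, where daily_participants references participants(id); the 3-day window is an arbitrary example):

// fetch addresses of participants seen recently
const { rows } = await pgPool.query(`
  SELECT DISTINCT p.participant_address
  FROM daily_participants d
  JOIN participants p ON p.id = d.participant_id
  WHERE d.day >= now() - interval '3 days'
`)
const dailyParticipantAddresses = rows.map(r => r.participant_address)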

@juliangruber (Member, Author):

Awesome, +1 to using daily_participants <3

I agree that adding the event is the right solution, but I don't think this is the right time for it.

@juliangruber juliangruber requested a review from bajtos June 12, 2024 13:27
Resolved review threads: observer/bin/dry-run.js, observer/test/observer.test.js, README.md, observer/bin/migrate.js, observer/lib/observer.js, observer/package.json, stats/lib/typings.d.ts
@bajtos (Member) left a comment:

:shipit:

@juliangruber juliangruber merged commit e03bcf0 into main Jun 12, 2024
10 checks passed
@juliangruber juliangruber deleted the add/scheduled-rewards branch June 12, 2024 14:16