-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: migration("re-indexing"), backfilling and diasgnostics tooling for the ChainIndexer
#12450
base: feat/msg-eth-tx-index
Are you sure you want to change the base?
feat: migration("re-indexing"), backfilling and diasgnostics tooling for the ChainIndexer
#12450
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the PR title to match https://github.com/filecoin-project/lotus/blob/master/CONTRIBUTING.md#pr-title-conventions
chainindex/api.go
Outdated
return &types.IndexValidation{ | ||
TipsetKey: ts.Key().String(), | ||
Height: uint64(ts.Height()), | ||
Backfilled: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fill out the other fields here using SQL queries.
@rvagg Would be great to have a first round of review here when you get the time. |
ChainIndexer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the PR title to match https://github.com/filecoin-project/lotus/blob/master/CONTRIBUTING.md#pr-title-conventions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the PR title to match https://github.com/filecoin-project/lotus/blob/master/CONTRIBUTING.md#pr-title-conventions
ChainIndexer
ChainIndexer
ChainIndexer
ChainIndexer
ChainIndexer
ChainIndexer
chain/index/indexer.go
Outdated
@@ -220,6 +233,9 @@ func (si *SqliteIndexer) Apply(ctx context.Context, from, to *types.TipSet) erro | |||
} | |||
si.closeLk.RUnlock() | |||
|
|||
si.writerLk.Lock() | |||
defer si.writerLk.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe unlock before notifyUpdateSubs
?
chain/index/indexer.go
Outdated
@@ -337,6 +353,9 @@ func (si *SqliteIndexer) Revert(ctx context.Context, from, to *types.TipSet) err | |||
} | |||
si.closeLk.RUnlock() | |||
|
|||
si.writerLk.Lock() | |||
defer si.writerLk.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto on unlocking before notifying subs?
&ps.getMinNonRevertedHeightStmt: "SELECT MIN(height) FROM tipset_message WHERE reverted = 0", | ||
&ps.hasNonRevertedTipsetStmt: "SELECT EXISTS(SELECT 1 FROM tipset_message WHERE tipset_key_cid = ? AND reverted = 0)", | ||
&ps.updateEventsToRevertedStmt: "UPDATE event SET reverted = 1 WHERE message_id IN (SELECT message_id FROM tipset_message WHERE tipset_key_cid = ?)", | ||
&ps.updateEventsToNonRevertedStmt: "UPDATE event SET reverted = 0 WHERE message_id IN (SELECT message_id FROM tipset_message WHERE tipset_key_cid = ?)", | ||
&ps.getMsgIdForMsgCidAndTipsetStmt: "SELECT message_id FROM tipset_message WHERE tipset_key_cid = ? AND message_cid = ? AND reverted = 0", | ||
&ps.insertEventStmt: "INSERT INTO event (message_id, event_index, emitter_addr, reverted) VALUES (?, ?, ?, ?)", | ||
&ps.insertEventStmt: "INSERT INTO event (message_id, event_index, emitter_addr, reverted) VALUES (?, ?, ?, ?) ON CONFLICT (message_id, event_index) DO UPDATE SET reverted = 0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some kind of test that exercises this and demonstrates it working well would be nice, also does the query plan show it behaving nicely? I can imagine this getting a bit weird matching for conflicts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Will be writing a test for this.
-
While sqliite does not generate query plans for
INSERT STATEMENTS
, the query plan for an equivalent query says this will be performant as we have a covering index for(message_id, event_index)
/.lotus-calibnet/sqlite% sqlite3 chainindex.db "EXPLAIN QUERY PLAN
INSERT INTO event (message_id, event_index, emitter_addr, reverted)
SELECT 1, 2, 'a', 0
WHERE NOT EXISTS (
SELECT 1 FROM event
WHERE message_id = 1 AND event_index = 2
);"
QUERY PLAN
|--SCAN CONSTANT ROW
`--SCALAR SUBQUERY 1
`--SEARCH event USING COVERING INDEX sqlite_autoindex_event_1 (message_id=? AND event_index=?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because we have an UNIQUE
constraint on (message_id, event_index)
chain/index/api.go
Outdated
return revertedCount, nonRevertedCount, nil | ||
} | ||
|
||
func (si *SqliteIndexer) ChainValidateIndex(ctx context.Context, epoch abi.ChainEpoch, backfill bool) (*types.IndexValidation, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we did this all in a single transaction, would that remove the need for a write lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, transaction provides atomicity, not exclusivity. Imagine this scenario:
Backfill
is called for epoch e -> it reads tipsetts
at that epochRevert
executes forts
and marks the tipset as reverted inTX1
Backfill
executes forts
and marks it as applied inTX2
TX1
is committed first and then TX2
is comitted.
Basically, it's hard to reason about state here without this lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One work-around here it to only allow backfilling for "safe" epochs i.e. <=head - 30
.
Then, the rewrite can only ever happen in an edge case.
|
||
// fetch the non-reverted tipset at this epoch | ||
var indexedTsKeyCidBytes []byte | ||
err = si.stmts.getNonRevertedTipsetAtHeightStmt.QueryRowContext(ctx, epoch).Scan(&indexedTsKeyCidBytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see the LIMIT 1
below if you're only expecting one result
chain/index/api.go
Outdated
} | ||
|
||
return &types.IndexValidation{ | ||
TipsetKey: expectedTs.Key().String(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below, I think just the key would be good here, also note capitalisation of TipSetKey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
chain/index/api.go
Outdated
return &types.IndexValidation{ | ||
TipsetKey: ts.Key().String(), | ||
Height: uint64(ts.Height()), | ||
Backfilled: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: other fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* feat: finish todos of validation api * feat: add indexed data verification with chain store * feat: address comments and finish TODO * fix: build issue * address comments * fix: ci issue
Thanks for all your work here @akaladarshi 👍 |
Co-authored-by: Rod Vagg <rod@vagg.org>
NonRevertedEventsCount uint64 | ||
Backfilled bool | ||
IndexedMessagesCount uint64 | ||
IndexedEventsCount uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename this to IndexedNonRevertedMsgCount
and IndexedNonRevertedEventsCount
to be precise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akaladarshi That is an implementation detail that will confuse users. They're only asking for the indexed entries at a specific epoch here and so ofcourse expect only non-reverted tipsets back.
* feat: add lotus-shed command for backfilling chain indexer * feat: add lotus-shed command for inspecting the chain indexer * feat: use single lotus-shed command to inspect and backfill * fix: remove the unused queries * small changes * add change log
@BigLep This is now ready for review. Working on a user guide for this tooling as discussed offline. |
@rvagg This is now in a good shape and has been tested on a calibnet node. Please take a look when you get the time. |
ChainIndexer Migration and Diagnostics Tooling
This PR implements the "migration" (really re-indexing / backfilling), and diagnostics tooling for the
ChainIndexer
implemented in PR #12450, and is part of the work for #12453. This tooling takes the form of both RPC APIs on the daemon andlotus-shed
CLI commands.Re-indexing Process
The re-indexing tool enables clients to index their entire existing ChainState in the
ChainIndexer
. This process is necessary due to the removal of the existingMsgIndex
,EthTxIndex
, andEventIndex
from Lotus.Why Re-index Instead of Migrate?
We've chosen to re-index rather than migrate data from existing indices for two primary reasons:
Instead, we're re-indexing the
Chainstore
/Chainstate
on the node into theChainIndexer
. This ensures that all re-indexed entries have gone through the indexing logic of the newChainIndexer
and that the Index is in sync/reflects the actual contents of theChainstore
/Chainstate
post re-indexing.Diagnostics Tooling
This PR introduces diagnostic tools for detecting corrupt Index entries at specific epochs or epoch ranges.
While this PR implements functionality for optionally backfilling missing Index entries, it does not yet include the capability to "repair" corrupted Indexed entries. The repair functionality will be introduced in a subsequent PR. This approach allows us to first gather and analyze user reports, helping us understand the types and causes of corrupted Indexed entries(and if all they exist in the new
ChainIndexer
) before implementing repair mechanisms.Core API
The fundamental building block for this tooling is the following RPC API:
This API has the following features:
Chainstore
state and the Indexed entries (tipset messages/events)lotus-shed
CLI toolingThe
lotus-shed
CLI tooling for both re-indexing/backfilling and diagnostics can then invoke this RPC API over epoch ranges. The correspondinglotus-shed backfill index [from, to]
andlotus-shed inspect index [from, to]
can then backfill/inspect/diagnose the Index for the given epoch ranges.TODO
lotus-shed
commands