Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(lib/trie): atomic tracked merkle values #2876

Merged
merged 3 commits into from
Oct 12, 2022

Conversation

qdm12
Copy link
Contributor

@qdm12 qdm12 commented Oct 6, 2022

Changes

  • Create local pendingDeletedMerkleValues set and inject it from any exported trie methods to unexported functions.
  • Move values from pendingDeletedMerkleValues to the trie deletedMerkleValues field if the exported method fully succeeds

For now this really only affects LoadFromMap if it fails after the first trie insertion.

However, this is QUITE IMPORTANT for lazy loading where most functions will now have to handle an error returned, and could fail at any step.

Tests

go test -tags integration github.com/ChainSafe/gossamer/lib/trie/...

Issues

Related to #2838

Primary Reviewer

@edwardmack

@codecov
Copy link

codecov bot commented Oct 6, 2022

Codecov Report

Merging #2876 (2c0d782) into development (17517ea) will increase coverage by 0.12%.
The diff coverage is 100.00%.

Additional details and impacted files
@@               Coverage Diff               @@
##           development    #2876      +/-   ##
===============================================
+ Coverage        63.10%   63.22%   +0.12%     
===============================================
  Files              219      219              
  Lines            27357    27397      +40     
===============================================
+ Hits             17264    17323      +59     
+ Misses            8500     8480      -20     
- Partials          1593     1594       +1     

lib/trie/trie.go Outdated Show resolved Hide resolved
lib/trie/trie.go Outdated Show resolved Hide resolved
lib/trie/trie.go Outdated Show resolved Hide resolved
lib/trie/trie.go Outdated Show resolved Hide resolved
lib/trie/trie.go Outdated Show resolved Hide resolved
lib/trie/trie.go Outdated
@@ -515,6 +538,10 @@ func (t *Trie) insertInBranch(parentBranch *Node, key, value []byte) (
// The keys are in hexadecimal little Endian encoding and the values
// are hexadecimal encoded.
func (t *Trie) LoadFromMap(data map[string]string) (err error) {
defer func() {
t.handleTrackedDeltas(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how err is every set in this. Won't it always be nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it will capture the returned named error, so if any return xxx happens it will get caught in that deferred function

lib/trie/trie.go Outdated
Comment on lines 83 to 87
if err == nil {
for merkleValue := range t.newDeletedMerkleValues {
t.deletedMerkleValues[merkleValue] = struct{}{}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems really awkward. Could you not just call this whenever err == nil, and explicitly reset the t.newDeletedMerkleValues.

Copy link
Contributor Author

@qdm12 qdm12 Oct 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I get your point, although this function is about to be stuffed with more logic to track inserted merkle values and remove re-inserted merkle values too. Maybe even more stuff with lazy loading. I would rather keep this function as a defer call on exported trie methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the function to take in a success bool which is set to err == nil in the caller (or just true for now, until next branch)

lib/trie/trie.go Outdated
// in progress. Its goal is to prevent having the trie `deletedMerkleValues`
// in a bad state if one of the trie operation modified some (deep copied) trie
// node and then failed.
newDeletedMerkleValues map[string]struct{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a need for this? Could you not just lock and unlock the trie so it will never be in a bad state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is about tracking deleted merkle values in the trie since the last trie snapshot, not the actual trie data.
The trie data is unaffected if an error occurs (since we don't update the root node of the trie), but those tracked merkle values would be wrongly tracked in case of an error occuring in the middle of an operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think every trie method should be transactional; the transaction mechanism in the state (based on the trie snapshot) allows to run multiple method calls and rollback (the trie data for now, not the tracked stuff, although this could be arranged).

However, there is a lot of trie calls without snapshotting the trie, and in that case, if the operation fails in the middle (for now it's only LoadFromMap, but soon all operations may fail at any time during their recursion), the trie will be left in a bad state.

Finally - that came up today - I actually need to track the deleted (and inserted) Merkle values during a single operation to track the inserted Merkle values correctly (i.e. a temporarily inserted node which then gets thrown out in the same method call should not be tracked as inserted).

Copy link
Contributor

@jimjbrettj jimjbrettj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit, but looks good to me

lib/trie/trie.go Outdated Show resolved Hide resolved
@qdm12 qdm12 force-pushed the qdm12/trie/transactional-deleted branch from e89b1cc to c2910f8 Compare October 11, 2022 11:25
@qdm12 qdm12 force-pushed the qdm12/trie/transactional-deleted branch from c2910f8 to 2e7acd3 Compare October 11, 2022 11:54
@qdm12
Copy link
Contributor Author

qdm12 commented Oct 11, 2022

@jimjbrettj @EclesioMeloJunior I changed this to inject a pendingDeletedMerkleValues to the (unexported) functions. This feels a bit better / more explicit, although it does change more function signatures than before. What do you think? 🤔

@qdm12
Copy link
Contributor Author

qdm12 commented Oct 12, 2022

@timwu20 merging this since I kind of need this badly now in a few other branches (and rebasing/changing PR target branch for the other 4 branches is rather time consuming). But feel free to comment back on the conversations which I hope would somehow be resolved, and I can address this in another PR.

@qdm12 qdm12 merged commit 1c4174c into development Oct 12, 2022
@qdm12 qdm12 deleted the qdm12/trie/transactional-deleted branch October 12, 2022 10:16
@qdm12
Copy link
Contributor Author

qdm12 commented Oct 13, 2022

Oops forgot to add a unit test for handleTrackedDeltas, added in #2873

github-actions bot pushed a commit that referenced this pull request Nov 23, 2022
# [0.7.0](v0.6.0...v0.7.0) (2022-11-23)

### Bug Fixes

* **chain:** update ed25519 addresses in dev/gssmr genesis files ([#2225](#2225)) ([5f47d8b](5f47d8b))
* **ci:** caching of Go caches ([#2451](#2451)) ([ce3c10c](ce3c10c))
* **ci:** codecov.yml configuration ([#2698](#2698)) ([d4fc383](d4fc383))
* **ci:** comment skip code for required workflows ([#2312](#2312)) ([45dce9b](45dce9b))
* **ci:** copyright workflow to exit if different files ([#2487](#2487)) ([89c32ae](89c32ae))
* **ci:** deepsource toml configuration ([#2744](#2744)) ([86a70de](86a70de))
* **ci:** embed v0.9.20 runtime, update test suite, and ci workflows ([#2543](#2543)) ([0fff418](0fff418)), closes [#2419](#2419) [#2561](#2561) [#2572](#2572) [#2581](#2581) [#2671](#2671)
* **ci:** fix staging Dockerfile ([#2474](#2474)) ([ae04b80](ae04b80))
* **ci:** mocks checking fixes ([#2274](#2274)) ([d1308e0](d1308e0))
* **ci:** run devnet module unit tests ([#2756](#2756)) ([f635c59](f635c59))
* **ci:** run golangci-lint on integration tests ([#2275](#2275)) ([3ae3401](3ae3401))
* **cmd:** allow --genesis flag to be passed to base command ([#2427](#2427)) ([7f5b5aa](7f5b5aa))
* **cmd:** avoid nil pointer dereference ([#2578](#2578)) ([f2cdfea](f2cdfea))
* **config:** temporary fix for pprof enabled setting precedence ([#2786](#2786)) ([d4d6262](d4d6262))
* **core:** fix txn pool for latest runtime ([#2809](#2809)) ([1551e66](1551e66))
* **deps:** upgrade chaindb to remove badger logs ([#2738](#2738)) ([e0c5706](e0c5706))
* **devnet:** Fix build workflow for devnet ([#2125](#2125)) ([0375fc2](0375fc2))
* **Dockerfile:** remove script entrypoint ([#2707](#2707)) ([abd161b](abd161b))
* **dot/core:** `RuntimeInstance` interface `Version` signature ([#2783](#2783)) ([7d66ec0](7d66ec0))
* **dot/core:** fix the race condition in TrieState ([#2499](#2499)) ([804069c](804069c)), closes [#2402](#2402)
* **dot/digest:** BABE NextEpochData and NextConfigData should be set on finalization ([#2339](#2339)) ([e991cc8](e991cc8))
* **dot/digest:** verify if next epoch already contains some definition ([#2472](#2472)) ([a2ac6c2](a2ac6c2))
* **dot/netwok:** check for duplicate message earlier ([#2435](#2435)) ([d62503f](d62503f))
* **dot/network:** change BlockRequestMessage number from uint64 to uint32 ([8105cd4](8105cd4))
* **dot/network:** close notifications streams ([#2093](#2093)) ([de6e7c9](de6e7c9)), closes [#2046](#2046)
* **dot/network:** fixing errMissingHandshakeMutex ([#2303](#2303)) ([eb07a53](eb07a53))
* **dot/network:** memory improvement for network buffers ([#2233](#2233)) ([fd9b70d](fd9b70d))
* **dot/network:** public IP address logging ([#2140](#2140)) ([9e21587](9e21587))
* **dot/network:** re-add nil mutex check for disconnected peer ([#2408](#2408)) ([9b39bd1](9b39bd1))
* **dot/network:** remove `defer cancel()` inside loop ([#2248](#2248)) ([9e360a5](9e360a5))
* **dot/network:** resize bytes slice buffer if needed ([#2291](#2291)) ([8db8b2a](8db8b2a))
* **dot/peerset:** fix sending on closed channel race condition when dropping peer ([#2573](#2573)) ([2fa5d8a](2fa5d8a))
* **dot/peerset:** remove race conditions from `peerset` package ([#2267](#2267)) ([df09d45](df09d45))
* **dot/rpc/modules:** grandpa.proveFinality update parameters, fix bug ([#2576](#2576)) ([e7749cf](e7749cf))
* **dot/rpc/modules:** rpc.state.queryStorage fixed ([#2565](#2565)) ([1ec0d47](1ec0d47))
* **dot/rpc:** include unsafe flags to be considered by RPC layer ([#2483](#2483)) ([3822257](3822257))
* **dot/state/epoch, lib/babe:** enable block production through epochs without rely on finalization ([#2593](#2593)) ([a0a1804](a0a1804))
* **dot/state:** actually prune finalized tries from memory ([#2196](#2196)) ([e4bc375](e4bc375))
* **dot/state:** change map of tries implementation to have working garbage collection ([#2206](#2206)) ([fada46b](fada46b))
* **dot/state:** inject mutex protected tries to states ([#2287](#2287)) ([67a9bbb](67a9bbb))
* **dot/subscription:** check websocket message from untrusted data ([#2527](#2527)) ([1f20d98](1f20d98))
* **dot/subscription:** unsafe type casting from untrusted input ([#2529](#2529)) ([1015733](1015733))
* **dot/sync, dot/rpc:** implement HighestBlock ([#2195](#2195)) ([f8d8657](f8d8657))
* **dot/sync:** cleanup logs; don't log case where we fail to get parent while processing ([#2188](#2188)) ([cb360ab](cb360ab))
* **dot/sync:** fix "block with unknown header is ready" error ([#2191](#2191)) ([483466f](483466f))
* **dot/sync:** fix `Test_lockQueue_threadSafety` ([#2605](#2605)) ([223cfbb](223cfbb))
* **dot/sync:** Fix flaky tests `Test_chainSync_logSyncSpeed` and `Test_chainSync_start` ([#2610](#2610)) ([7e1014b](7e1014b))
* **dot/sync:** Gossip `BlockAnnounceMessage` only after  successfully imported ([#2885](#2885)) ([69031a6](69031a6))
* **dot/sync:** remove block announcement in `bootstrap` sync mode ([#2906](#2906)) ([2b4c257](2b4c257))
* **dot/sync:** remove max size limit from ascending block requests ([#2256](#2256)) ([e287d7e](e287d7e))
* **dot/sync:** sync benchmark ([#2234](#2234)) ([2f3aef8](2f3aef8))
* **dot/telemetry:** telemetry hashes to be in the hexadecimal format ([#2194](#2194)) ([9b48106](9b48106))
* **dot:** database close error checks ([#2948](#2948)) ([bdb0eea](bdb0eea))
* **dot:** no error logged for init check ([#2502](#2502)) ([2971325](2971325))
* ensure we convert the `uint` type ([#2626](#2626)) ([792e53f](792e53f))
* fix logger mutex locking in `.New` method ([#2114](#2114)) ([e7207ed](e7207ed))
* **internal/log:** log level `DoNotChange` ([#2672](#2672)) ([0008b59](0008b59))
* **levels-logged:** Fix log levels logging at start ([#2236](#2236)) ([a90a6e0](a90a6e0))
* **lib/babe:** check if authority index is in the `authorities` range ([#2601](#2601)) ([1072888](1072888))
* **lib/babe:** ensure the slot time is correct before build a block ([#2648](#2648)) ([78c03b6](78c03b6))
* **lib/babe:** epoch context error wrapping ([#2484](#2484)) ([c053dea](c053dea))
* **lib/babe:** Unrestricted Loop When Building Blocks (GSR-19) ([#2632](#2632)) ([139ad89](139ad89))
* **lib/blocktree:** reimplement `BestBlockHash` to take into account primary blocks in fork choice rule ([#2254](#2254)) ([1a368e2](1a368e2))
* **lib/grandpa:** avoid spamming round messages ([#2688](#2688)) ([b0042b8](b0042b8))
* **lib/grandpa:** capped number of tracked commit messages ([#2490](#2490)) ([47c23e6](47c23e6))
* **lib/grandpa:** capped number of tracked vote messages ([#2485](#2485)) ([d2ee47e](d2ee47e)), closes [#1531](#1531)
* **lib/grandpa:** check equivocatory votes count ([#2497](#2497)) ([014629d](014629d)), closes [#2401](#2401)
* **lib/grandpa:** Duplicate votes is GRANDPA are counted as equivocatory votes (GSR-11) ([#2624](#2624)) ([422e7b3](422e7b3))
* **lib/grandpa:** Storing Justification Allows Extra Bytes (GSR-13) ([#2618](#2618)) ([0fcde63](0fcde63))
* **lib/grandpa:** update grandpa protocol ID ([#2678](#2678)) ([3be75b2](3be75b2))
* **lib/grandpa:** various finality fixes, improves cross-client finality ([#2368](#2368)) ([c04d185](c04d185))
* **lib/grandpa:** verify equivocatory votes in grandpa justifications ([#2486](#2486)) ([368f8b6](368f8b6))
* **lib/runtime:** avoid caching version in runtime instance ([#2425](#2425)) ([7ab31f0](7ab31f0))
* **lib/runtime:** stub v0.9.17 host API functions ([#2420](#2420)) ([6a7b223](6a7b223))
* **lib/trie:** `handleDeletion` generation propagation ([24c303d](24c303d))
* **lib/trie:** `PopulateMerkleValues` functionality changes and fixes ([#2871](#2871)) ([7131290](7131290))
* **lib/trie:** Check for root in EncodeAndHash ([#2359](#2359)) ([087db89](087db89))
* **lib/trie:** Make sure writing and reading a trie to disk gives the same trie  and cover more store/load child trie related test cases ([#2302](#2302)) ([7cd4118](7cd4118))
* **lib/trie:** prepare trie nodes for mutation only when needed ([#2834](#2834)) ([26868df](26868df))
* **lib/trie:** remove map deletion at `loadProof` ([#2259](#2259)) ([fbd13d2](fbd13d2))
* **lint:** fix issues found by golangcilint 1.47.3 ([#2715](#2715)) ([5765e67](5765e67))
* **mocks:** add missing `//go:generate` for mocks ([#2273](#2273)) ([f4f7465](f4f7465))
* **pprof:** pprofserver flag changed to boolean ([#2205](#2205)) ([be00a69](be00a69))
* **staging:** revise datadog-agent start process ([#2935](#2935)) ([36ce37d](36ce37d))
* **state/epoch:** assign epoch 1 when block number is 0 ([#2592](#2592)) ([e5c8cf5](e5c8cf5))
* **state/grandpa:** track changes across forks ([#2519](#2519)) ([3ab76bc](3ab76bc))
* **tests:** `TestAuthorModule_HasSessionKeys_Integration` ([#2932](#2932)) ([8d809aa](8d809aa))
* **tests:** fix block body regex in `TestChainRPC` ([#2805](#2805)) ([b0680f8](b0680f8))
* **tests:** Fix RFC3339 regex for log unit tests ([9caea2a](9caea2a))
* **tests:** Fix wasmer flaky sorts ([#2643](#2643)) ([7eede9a](7eede9a))
* **tests:** handle node crash during waiting ([#2691](#2691)) ([843bd50](843bd50))
* **tests:** update block body regex in `TestChainRPC` ([#2674](#2674)) ([055e5c3](055e5c3))
* **trie:** decode inline child nodes ([#2369](#2369)) ([9efde47](9efde47))
* **trie:** descendants count for clear prefix ([#2606](#2606)) ([1826896](1826896))
* **trie:** disallow empty byte slice node values ([#2927](#2927)) ([d769d1c](d769d1c))
* **trie:** equality differentiate nil and empty storage values ([#2969](#2969)) ([72a08ec](72a08ec))
* **trie:** no in-memory caching of node encoding ([#2919](#2919)) ([856780b](856780b))
* **trie:** Panic when deleting nonexistent keys from trie (GSR-10) ([#2609](#2609)) ([7886318](7886318))
* **trie:** remove encoding buffers pool ([#2929](#2929)) ([f4074cc](f4074cc))
* **trie:** use cached Merkle values for root hash ([#2943](#2943)) ([ec2549a](ec2549a))
* **trie:** use direct Merkle value for database keys ([#2725](#2725)) ([1a3c3ae](1a3c3ae))
* upgrade auto-generated mocks ([#2910](#2910)) ([a2975a5](a2975a5))
* **wasmer:** error logs for signature verification ([#2752](#2752)) ([363c080](363c080))
* **wasmer:** fix flaky sort in `Test_ext_crypto_sr25519_public_keys_version_1` ([#2607](#2607)) ([c061b35](c061b35))

### Features

* **build:** add `github.com/breml/rootcerts` ([#2695](#2695)) ([c74a5b0](c74a5b0))
* **build:** binary built-in timezone data ([#2697](#2697)) ([fdd5bda](fdd5bda))
* **chain:** use always the raw genesis file ([#2775](#2775)) ([dd2fbc9](dd2fbc9))
* **ci:** update mockery from `2.10` to `2.14` ([#2642](#2642)) ([d2c42b8](d2c42b8))
* **cross-client:** create docker-compose.yml for local devnet  ([#2282](#2282)) ([8abbd87](8abbd87))
* detect chain directory dynamically ([#2292](#2292)) ([85c466c](85c466c))
* **devnet:** add substrate docker images to dockerfile  ([#2263](#2263)) ([b7b2a66](b7b2a66))
* **devnet:** continuous integration `gssmr` devnet on AWS ECS ([#2096](#2096)) ([d096d44](d096d44))
* **docker:** docker-compose.yml to run Gossamer, Prometheus and Grafana ([#2706](#2706)) ([c5dda51](c5dda51))
* **dot/network:** add mismatched genesis peer reporting ([#2265](#2265)) ([a1d7269](a1d7269))
* **dot/state:** `gossamer_storage_tries_cached_total` gauge metric ([#2272](#2272)) ([625cbcf](625cbcf))
* **e2e:** build Gossamer on any test run ([#2608](#2608)) ([f97e0ef](f97e0ef))
* **go:** upgrade Go from 1.17 to 1.18 ([#2379](#2379)) ([d85a1db](d85a1db))
* include nested varying data type on neighbor messages ([#2722](#2722)) ([426569a](426569a))
* **lib/babe:** implement secondary slot block production ([#2260](#2260)) ([fcb81a3](fcb81a3))
* **lib/runtime:** support Substrate WASM compression ([#2213](#2213)) ([fd60061](fd60061))
* **lib/trie:** atomic tracked merkle values ([#2876](#2876)) ([1c4174c](1c4174c))
* **lib/trie:** clear fields when node is dirty ([#2297](#2297)) ([1162828](1162828))
* **lib/trie:** only copy nodes when mutation is certain ([#2352](#2352)) ([86624cf](86624cf))
* **lib/trie:** opportunistic parallel hashing ([#2081](#2081)) ([790dfb5](790dfb5))
* **metrics:** replace metrics port with address (breaking change) ([#2382](#2382)) ([d2ec68d](d2ec68d))
* **pkg/scale:** add `Encoder` with `Encode` method ([#2741](#2741)) ([af5c63f](af5c63f))
* **pkg/scale:** add use of pkg/error Wrap for error handling ([#2708](#2708)) ([08c4281](08c4281))
* **pkg/scale:** encoding and decoding of maps in scale ([#2894](#2894)) ([405db51](405db51)), closes [#2796](#2796)
* **pkg/scale:** support for custom `VaryingDataType` types ([#2612](#2612)) ([914a747](914a747))
* remove uneeded runtime prefix logs ([#2110](#2110)) ([8bd05d1](8bd05d1))
* remove unused code ([#2677](#2677)) ([b3698d7](b3698d7))
* **scale:** add range checks to decodeUint function ([#2683](#2683)) ([ac700f8](ac700f8))
* **trie:** decode all inlined node variants ([#2611](#2611)) ([b09eb07](b09eb07))
* **trie:** export `LoadFromProof` ([#2455](#2455)) ([0b4f33d](0b4f33d))
* **trie:** faster header decoding ([#2649](#2649)) ([d9460e3](d9460e3))
* **trie:** finer deep copy of nodes ([#2384](#2384)) ([bd6d8e4](bd6d8e4))
* **trie:** tracking of number of descendant nodes for each node ([#2378](#2378)) ([dfcdd3c](dfcdd3c))
* **trie:** use scale encoder ([#2930](#2930)) ([e3dc108](e3dc108))
* **wasmer/crypto:** move sig verifier to crypto pkg ([#2057](#2057)) ([dc8bbef](dc8bbef))
* **wasmer:** Add `SetTestVersion` method to `Config` struct ([#2823](#2823)) ([e5c9336](e5c9336))
* **wasmer:** get and cache state version in instance context ([#2747](#2747)) ([3fd63db](3fd63db))
@github-actions
Copy link

🎉 This PR is included in version 0.7.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants