refactor: fixup fvm IPLD flush logic #1810
Conversation
In preparation for reachability analysis (we're going to re-use this same code).

- Make it less generic (performance).
- Remove blocks from the write buffer as we write them to avoid duplicate writes.
- Simplify some of the checks around what is allowed. For example, I'm now allowing CBOR + identity hash, which should have been allowed previously but wasn't (we don't use it, but it still should have been allowed).
- Remove the explicit 100-byte CID length check. The `Cid` type already validates that the digest can be no longer than 64 bytes.
- Be less strict on DagCBOR validation. Counter-intuitively, being overly strict here is dangerous as it gives us more points where implementations can disagree and fork. Instead, we enforce the following rules for DAG_CBOR:
  1. Blocks must have a valid CBOR structure, but _values_ aren't validated. E.g., no UTF-8 validation, no float validation, no minimum encoding requirements, no canonical ordering requirements, etc.
  2. All CBOR values tagged with 42 must be valid CIDs. I.e., a CBOR byte string starting with a 0x00 byte followed by a valid CID with at most a 64-byte digest.
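Rule 2 can be sketched roughly as follows. This is a std-only illustration; the names `check_tag_42_value` and `read_uvarint` are hypothetical, and the real code delegates the CID parsing to the `Cid` type rather than decoding varints by hand:

```rust
// Minimal uvarint reader (sketch; not the production decoder).
fn read_uvarint(buf: &[u8], pos: &mut usize) -> Result<u64, String> {
    let mut out: u64 = 0;
    for shift in (0..64).step_by(7) {
        let b = *buf.get(*pos).ok_or("varint: unexpected end of input")?;
        *pos += 1;
        if shift == 63 && (b & 0x7f) > 1 {
            return Err("varint overflow".into());
        }
        out |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            return Ok(out);
        }
    }
    Err("varint too long".into())
}

/// Rule 2 sketch: a CBOR value tagged 42 must be a byte string whose first
/// byte is 0x00 (the identity multibase prefix used for binary CIDs in
/// DAG-CBOR), followed by a CIDv1 whose multihash digest is at most 64 bytes.
fn check_tag_42_value(bytes: &[u8]) -> Result<(), String> {
    let (first, cid_bytes) = bytes.split_first().ok_or("empty byte string")?;
    if *first != 0x00 {
        return Err("tag-42 byte string must start with 0x00".into());
    }
    let mut pos = 0;
    let version = read_uvarint(cid_bytes, &mut pos)?; // CID version
    if version != 1 {
        return Err(format!("unsupported CID version {version}"));
    }
    let _codec = read_uvarint(cid_bytes, &mut pos)?; // content codec
    let _hash_code = read_uvarint(cid_bytes, &mut pos)?; // multihash function
    let digest_len = read_uvarint(cid_bytes, &mut pos)?; // digest length
    if digest_len > 64 {
        return Err(format!("digest too long: {digest_len} bytes"));
    }
    if cid_bytes.len() - pos != digest_len as usize {
        return Err("digest length mismatch".into());
    }
    Ok(())
}

fn main() {
    // A CIDv1 (raw codec 0x55, identity hash, 3-byte digest) behind the 0x00 prefix.
    assert!(check_tag_42_value(&[0x00, 0x01, 0x55, 0x00, 0x03, 1, 2, 3]).is_ok());
    // Missing the leading 0x00 byte.
    assert!(check_tag_42_value(&[0x01, 0x55, 0x00, 0x03, 1, 2, 3]).is_err());
}
```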
```rust
// The actual CID is expected to be a byte string
if maj != 2 {
    return Err(anyhow!("expected cbor type byte string in input"));
}
if extra > 100 {
```
This was unnecessary. If the CID is invalid and/or too big, `Cid::try_from` will catch that.
```rust
match k.codec() {
    // We ignore piece commitment CIDs.
    FIL_COMMITMENT_UNSEALED | FIL_COMMITMENT_SEALED => continue,
    // We allow raw, cbor, and dag cbor.
    IPLD_RAW | DAG_CBOR | CBOR => (),
    // Everything else is rejected.
    codec => return Err(anyhow!("cid {k} has unexpected codec ({codec})")),
}
```

```rust
// Ignore commitments (not even going to check the hash function).
(FIL_COMMITMENT_UNSEALED | FIL_COMMITMENT_SEALED, _, _) => return Ok(()),
// Fail on anything else. We usually want to continue on error, but there's really no going
// back from here.
(codec, hash, length) => {
    return Err(anyhow!(
        "cid {root} has unexpected codec ({codec}), hash ({hash}), or length ({length})"
    ))
}
```

```rust
// Check the hash construction.
match (k.hash().code(), k.hash().size()) {
    // Allow non-truncated blake2b-256 and identity hashes.
    (BLAKE2B_256, BLAKE2B_LEN) | (IDENTITY, _) => (),
    // Reject everything else.
    (hash, length) => {
        return Err(anyhow!(
            "cid {k} has unexpected multihash (code={hash}, len={length})"
        ))
    }
}
```
The previous logic was shorter but hard to reason about.
Yes, this is more readable
```rust
Ok(())
```

```rust
self.base
    .put_many_keyed(take_reachable(&mut self.write.borrow_mut(), root)?)
```
I changed this to "take" blocks so we wouldn't end up writing them multiple times. The alternative would have been to keep some form of "written" set.
For some history, we used to clear the entire buffer on flush. However, that ended up causing some issues if we, e.g., wanted to flush some intermediate state. Now we take a middle-path of deleting the written blocks but leaving everything else.
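The "take" semantics described here can be sketched with a toy write buffer. These are hypothetical types and names for illustration; the real buffer is keyed by CID and walks links embedded in IPLD blocks:

```rust
use std::collections::HashMap;

// A toy block: payload plus links to child block ids (hypothetical shape).
type Block = (Vec<u8>, Vec<u64>);

/// Sketch of the "take" flush: collect every block reachable from `root`,
/// *removing* each from the write buffer as it is collected. Blocks written
/// once won't be written again on a later flush, while unreachable blocks
/// stay buffered.
fn take_reachable(buffer: &mut HashMap<u64, Block>, root: u64) -> Vec<(u64, Vec<u8>)> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        // `remove` doubles as the "seen" check: already-taken (or absent,
        // i.e. already-written) blocks are simply skipped.
        if let Some((data, links)) = buffer.remove(&id) {
            stack.extend(links);
            out.push((id, data));
        }
    }
    out
}

fn main() {
    let mut buffer = HashMap::new();
    buffer.insert(1, (b"root".to_vec(), vec![2]));
    buffer.insert(2, (b"leaf".to_vec(), vec![]));
    buffer.insert(3, (b"orphan".to_vec(), vec![]));
    let taken = take_reachable(&mut buffer, 1);
    assert_eq!(taken.len(), 2); // root and leaf were taken
    assert!(buffer.contains_key(&3)); // the unreachable block stays buffered
    // A second flush from the same root writes nothing new.
    assert!(take_reachable(&mut buffer, 1).is_empty());
}
```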
Will this then end up saving disk space?
Lotus checks if we have a block before writing it, so it shouldn't matter from a disk-space perspective. However, it should save some time:
- If we have to flush multiple times. Although I think we only do this in some tools/tests.
- If we have identical subtrees and/or common blocks. I could have solved that by keeping a temporary "seen" set while flushing instead of actually removing the flushed blocks from the write buffer, but this way was faster/simpler.
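For comparison, the "seen" set alternative mentioned above might look like this (a hypothetical sketch using the same toy buffer shape): it dedupes shared subtrees within one flush but leaves the buffer intact, so a second flush re-sends everything.

```rust
use std::collections::{HashMap, HashSet};

// A toy block: payload plus links to child block ids (hypothetical shape).
type Block = (Vec<u8>, Vec<u64>);

/// The alternative approach: track visited ids in a temporary "seen" set
/// while flushing, without removing anything from the write buffer.
fn collect_reachable(buffer: &HashMap<u64, Block>, root: u64) -> Vec<(u64, Vec<u8>)> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already visited during this flush
        }
        if let Some((data, links)) = buffer.get(&id) {
            stack.extend(links.iter().copied());
            out.push((id, data.clone()));
        }
    }
    out
}

fn main() {
    let mut buffer = HashMap::new();
    // A diamond: root links to the same leaf via two branches.
    buffer.insert(1, (b"root".to_vec(), vec![2, 3]));
    buffer.insert(2, (b"left".to_vec(), vec![4]));
    buffer.insert(3, (b"right".to_vec(), vec![4]));
    buffer.insert(4, (b"leaf".to_vec(), vec![]));
    let blocks = collect_reachable(&buffer, 1);
    assert_eq!(blocks.len(), 4); // the shared leaf is emitted only once
    assert_eq!(buffer.len(), 4); // but the buffer is left untouched
}
```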
Codecov Report

```
@@            Coverage Diff             @@
##           master    #1810      +/-   ##
==========================================
+ Coverage   75.20%   75.31%   +0.11%
==========================================
  Files         149      149
  Lines       14593    14558      -35
==========================================
- Hits        10975    10965      -10
+ Misses       3618     3593      -25
```
```rust
let val = match low {
    ..=23 => low.into(),
    24 => read_fixed::<1>(br)?[0].into(),
    25 => u16::from_be_bytes(read_fixed(br)?).into(),
    26 => u32::from_be_bytes(read_fixed(br)?).into(),
    27 => u64::from_be_bytes(read_fixed(br)?),
    _ => return Err(anyhow!("invalid header cbor_read_header_buf")),
};
```
Great refactor for improving readability!
Regarding performance, IIUC this is now doing 2 extra allocations for each call instead of using a scratch buffer as before?
Stack allocations, so they should be free (except zeroing, but that's pretty cheap and Rust can likely optimize it away).
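For reference, a minimal version of the fixed-size read helper (the signature is assumed from the diff; the real `read_fixed` may differ): the returned `[u8; N]` is a stack array, so the "extra allocations" are just cheap zeroed stack slots, not heap allocations.

```rust
use std::io::Read;

/// Read exactly N bytes into a stack-allocated array (sketch).
fn read_fixed<const N: usize>(r: &mut impl Read) -> std::io::Result<[u8; N]> {
    let mut buf = [0u8; N]; // lives on the stack; zeroing is cheap
    r.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    // Decode a CBOR "additional info" of 25: a big-endian u16 follows.
    let mut input: &[u8] = &[0x01, 0x02];
    let val = u16::from_be_bytes(read_fixed(&mut input)?);
    assert_eq!(val, 0x0102);
    Ok(())
}
```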
LGTM
Taking the flushed blocks is O(blocks) for blocks already in memory, so there should be no danger of running out of memory.

Working towards https://github.com/filecoin-project/fvm-pm/issues/715