-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add simple test of pageserver recovery after crash. #1324
Conversation
IMHO we should always build with failpoints enabled. There's little harm in compiling them in, we'll just not use them in production. |
I am not sure. Documentation of failpoints says:
So if we enable this feature, then failpoint! macro will generate some code which (I assume) will check if there are some actions are associated with this failpoint. I do not know how expensive this check is (most likely hash table lookup, but protected by some mutex), but as far as failpoints may be inserted in performance critical parts of the code, I prefer not to enable them in production. At least without measuring first performance penalty introduced by failpoints. I will check it tomorrow. |
Hmm, looking at the code, it acquires an RwLock in read mode, and then a hash table lookup. Yeah, I wouldn't want to put that into any performance-senstive loop. I wish it had a fast-path for the case that no failpoints are set, by checking atomic variable first or something... The failpoints included here are not performance sensitive, so we could always enable them for now. And figure out how to make it cheaper later, if we want to put a failpoint in a more critical path. |
We could use conditional compilation + cmdline flag (say, [dependencies]
fail = "*"
[features]
failpoints = ["fail/failpoints"] fn main() {
let features: &[&str] = &[
#[cfg(feature = "failpoints")]
"failpoints",
];
// TODO: hide this print behind a cmdline flag
println!("available features: {:?}", features);
} Also, I'd throw a hard error instead of warning. Nobody reads warnings if they don't abort CI anyway. |
8042ea3
to
08dc656
Compare
08dc656
to
f7453c5
Compare
f7453c5
to
5eb05f3
Compare
@funbringer , rebased branch didn't compile with some your changes.
|
@lubennikovaav Thanks! I guess you could drop this piece of code altogether. I forgot to remove it after testing. |
5eb05f3
to
e9b261d
Compare
5ca8e6f
to
95ecd98
Compare
I'm doing some fail points related stuff in #1571 I can extract it to merge it faster if it'll help |
… use failpoints in checkpointer
95ecd98
to
f7877e8
Compare
To cause a crash, use failpoints in checkpointer
Rebased version of #1043 by @knizhnik
This test requires pageserver built with
cargo build --features fail/failpoints
, otherwise it will hang.Is there a way to check it from pytest and throw a warning?