Refactors is_banned logic and forces health check on unban #288
src/pool.rs
Outdated
warn!("Unbanning all replicas.");
return false;
write_guard[address.shard].clear();
drop(write_guard);
I drop it before issuing a `warn!` because `warn!` is technically IO, so it's pretty slow. In this case though, it may be okay to wait the 0.0001s it takes to print something to the screen before unlocking the mutex. If you're dropping it just before a `return`, you don't need an explicit drop because it will get dropped as the function returns.
The idea here is to time the log closer to the actual banning event
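To make the trade-off in this thread concrete, here is a minimal sketch of releasing the write guard before logging. It assumes `std::sync::RwLock` (the lock type in `src/pool.rs` may differ), uses `eprintln!` in place of `warn!` to stay dependency-free, and `BanList` with its methods is a hypothetical stand-in, not the pool's actual type.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::Instant;

/// Hypothetical stand-in for the pool's ban list: one map per shard.
pub struct BanList {
    shards: RwLock<Vec<HashMap<String, Instant>>>,
}

impl BanList {
    pub fn new(shard_count: usize) -> Self {
        BanList {
            shards: RwLock::new(vec![HashMap::new(); shard_count]),
        }
    }

    /// Records a ban for `address` in the given shard.
    pub fn ban(&self, shard: usize, address: &str) {
        let mut write_guard = self.shards.write().unwrap();
        write_guard[shard].insert(address.to_string(), Instant::now());
    }

    /// Clears one shard and returns how many entries were removed.
    pub fn unban_all(&self, shard: usize) -> usize {
        let removed = {
            let mut write_guard = self.shards.write().unwrap();
            let count = write_guard[shard].len();
            write_guard[shard].clear();
            count
        }; // write_guard goes out of scope here, so the lock is released

        // The (comparatively slow) log write happens without holding the lock.
        eprintln!("Unbanning all replicas in shard {} ({} removed).", shard, removed);
        removed
    }

    /// Number of banned addresses in the given shard.
    pub fn banned_count(&self, shard: usize) -> usize {
        let read_guard = self.shards.read().unwrap();
        read_guard[shard].len()
    }
}
```

Scoping the guard in a block has the same effect as an explicit `drop(write_guard)`. Logging before the block instead would time the message closer to the event, at the cost of holding the lock during the write.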
src/pool.rs
Outdated
debug!("{:?} is ok", address);
false
}
let now = chrono::offset::Utc::now().naive_utc();
It would be cool to just use `std::time` instead of `chrono`, since we don't really care about timezones. We assume that the timezone of the machine won't change between invocations of `Instant::now()`.
this is being used consistently for all the banning logic, can create a new PR to change this later
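For reference, a minimal sketch of what the `std::time` variant of the expiry check could look like. The function name `ban_expired` and its parameters are hypothetical. It assumes the ban timestamp is stored as an `Instant` rather than the `chrono` value the banning logic currently uses.

```rust
use std::time::{Duration, Instant};

/// Returns true once `ban_time` has elapsed between `banned_at` and `now`.
/// `Instant` is monotonic and carries no timezone or wall-clock information.
pub fn ban_expired(banned_at: Instant, now: Instant, ban_time: Duration) -> bool {
    // saturating_duration_since yields zero if `now` is earlier than `banned_at`.
    now.saturating_duration_since(banned_at) >= ban_time
}
```

Passing `now` in as a parameter keeps the check deterministic; a caller would normally supply `Instant::now()`.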
src/pool.rs
Outdated
true
} else {
warn!("{:?} is banned", address);
This will spam the log massively on loaded systems, think 4,000 times per second and more. It should be debug imo.
yeah that's a fair point.
let guard = self.banlist.read();
/// Determines if we can try to unban this server
pub async fn can_unban(&self, address: &Address) -> bool {
// If somehow primary ends up being banned we should return true here
We should keep the "why", i.e. the primary can never and should never be banned.
src/pool.rs
Outdated
return true;
}

// Check if all instances are banned, in that case unban everything
- // Check if all instances are banned, in that case unban everything
+ // Check if all replicas are banned, in that case unban all of them
let read_guard = self.banlist.read();
let banned_timestamp = match read_guard[address.shard].get(address) {
Some(timestamp) => timestamp.clone(),
None => return true,
The debug here was useful to know that the instance is not banned during development. It would be nice to log `address is ok` here.
This function is not responsible for saying the address is okay, since the `is_banned` function is doing that and logging it.
src/pool.rs
Outdated
// Check if ban time is expired
let read_guard = self.banlist.read();
let banned_timestamp = match read_guard[address.shard].get(address) {
Some(timestamp) => timestamp.clone(),
Don't need to clone, you can keep a reference.
I clone since `get` results in a reference and we want to drop the guard immediately after reading this value. I can do the operations I need before dropping it, though.
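A minimal sketch of the pattern under discussion: copy the timestamp out from under the read guard so the lock can be released right away. It assumes `std::sync::RwLock` and a timestamp type that is `Copy` (here `Instant`); the function `banned_since` and the `String` key are hypothetical simplifications of the real ban list.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::Instant;

/// Looks up when `address` was banned, holding the read lock only for the lookup.
pub fn banned_since(
    banlist: &RwLock<HashMap<String, Instant>>,
    address: &str,
) -> Option<Instant> {
    let read_guard = banlist.read().unwrap();
    // `get` returns Option<&Instant>, which borrows from the guard;
    // `copied` turns it into an owned Option<Instant>.
    let banned_at = read_guard.get(address).copied();
    // With an owned value in hand, the guard can be released immediately.
    drop(read_guard);
    banned_at
}
```

Keeping the reference instead avoids the copy, but the guard then has to stay alive for as long as the reference is used.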
src/pool.rs
Outdated

let guard = self.banlist.read();
/// Determines if we can try to unban this server
pub async fn can_unban(&self, address: &Address) -> bool {
I don't think this is the right name for this function. `can_unban` tells me that you're checking if you're allowed to unban the instance given some condition, and then you can decide whether to do so or not. In this implementation, you're unbanning the instance if you can in the same function. Maybe a better name could be `is_unbanned`?
changed to `try_unban`
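The naming distinction raised above can be sketched as follows: a `can_unban`-style method only answers a question, while a `try_unban`-style method also performs the unban when allowed. The `Bans` struct and its fields are hypothetical and much simpler than the pool's real state; this is an illustration of the naming convention, not the PR's implementation.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical, single-shard ban table used only for this illustration.
pub struct Bans {
    ban_time: Duration,
    banned: HashMap<String, Instant>,
}

impl Bans {
    pub fn new(ban_time: Duration) -> Self {
        Bans { ban_time, banned: HashMap::new() }
    }

    pub fn ban(&mut self, address: &str, at: Instant) {
        self.banned.insert(address.to_string(), at);
    }

    pub fn is_banned(&self, address: &str) -> bool {
        self.banned.contains_key(address)
    }

    /// Pure check: takes `&self` and never changes the ban table.
    pub fn can_unban(&self, address: &str, now: Instant) -> bool {
        match self.banned.get(address) {
            Some(banned_at) => now.saturating_duration_since(*banned_at) >= self.ban_time,
            None => true,
        }
    }

    /// Check and act: removes the ban if it has expired, and reports whether
    /// the address is unbanned afterwards.
    pub fn try_unban(&mut self, address: &str, now: Instant) -> bool {
        if self.can_unban(address, now) {
            self.banned.remove(address);
            true
        } else {
            false
        }
    }
}
```

The `&self` versus `&mut self` receivers make the difference visible in the signature: only `try_unban` is able to mutate state.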
Sweet.
A banned instance will only be banned for the duration of the ban_time setting. If an instance experiences issues longer than the ban time and shorter than the health check delay, it will return to the pool of available instances and clients will connect to the bad replica.
This PR also makes the `is_banned` function more lightweight (it's used by the admin db) and forces a health check after unbanning an instance.
Refactor of: #184
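The gap described above (ban expired, next health check not yet due) can be sketched with a small decision function. The name `should_health_check`, its parameters, and the `just_unbanned` flag are hypothetical; the sketch only illustrates the idea of forcing a check on unban and is not the PR's code.

```rust
use std::time::{Duration, Instant};

/// Decides whether a server needs a health check before being handed to a client.
/// A server that was just unbanned is always checked, even if the regular
/// health check delay has not elapsed yet.
pub fn should_health_check(
    last_checked: Instant,
    now: Instant,
    healthcheck_delay: Duration,
    just_unbanned: bool,
) -> bool {
    just_unbanned || now.saturating_duration_since(last_checked) >= healthcheck_delay
}
```

With a 60-second ban and a 300-second health check delay, a replica unbanned at 61 seconds would otherwise be served unchecked until the delay elapsed; the forced check closes that window.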