Allow applying entries on durable remote quorum #37

Draft: wants to merge 2 commits into base: main

@tbg (Collaborator) commented Mar 17, 2023

Currently, a node will only emit entries for application that are
durably in the log locally.

To provide high availability in the presence of grey failures such as
disk stalls, it is desirable to be able to apply entries that are not
durable locally but are nevertheless known to have been persisted on a
quorum.

This PR prototypes such a mechanism, which is opt-in and only available
in conjunction with async storage writes.
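As a rough illustration of the difference between the two modes, the
following Go sketch computes the highest index that may be handed to the
state machine. The type `applyConfig`, its field `ApplyUnstableEntries`,
and the function `applicableIndex` are hypothetical names chosen for this
example; they are not taken from this PR or from the library's API.

```go
package main

import "fmt"

// applyConfig is a hypothetical configuration type. The field name is an
// assumption made for this sketch, not the option introduced by the PR.
type applyConfig struct {
	// ApplyUnstableEntries opts in to applying entries that are committed
	// (persisted on a quorum) but not yet durable in the local log.
	ApplyUnstableEntries bool
}

// applicableIndex returns the highest log index that may be emitted for
// application, given the commit index and the highest locally durable index.
func applicableIndex(cfg applyConfig, committed, localDurable uint64) uint64 {
	if cfg.ApplyUnstableEntries {
		// Opt-in behavior: quorum durability is sufficient.
		return committed
	}
	// Default behavior: never apply past what is durable locally.
	if localDurable < committed {
		return localDurable
	}
	return committed
}

func main() {
	// The local disk is stalled at index 5 while a quorum has persisted
	// everything up to index 8.
	fmt.Println(applicableIndex(applyConfig{}, 8, 5))
	fmt.Println(applicableIndex(applyConfig{ApplyUnstableEntries: true}, 8, 5))
}
```

With the flag off, application stalls together with the local disk; with
it on, application follows the commit index.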

Applying entries that aren't in the local log requires some care:

  • if snapshots are sent based on the applied state, the leader may have
    to catch up the follower from its unstable storage. This is fine: if
    the leader crashes, the next leader will definitely have a durable log
    that can catch up the snapshot recipient, thanks to the log completeness
    property (i.e. the next leader is from the quorum).
  • orchestrating log truncation via a side effect of applying an entry is
    more complicated since the truncation may now be carried out on a
    portion of the log that will only be written in the future.
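To make the truncation concern concrete, here is one possible way an
application could handle it: remember the requested truncation index and
only remove the prefix that is already durable locally, finishing the job
once the remaining entries have been written. This is a sketch under that
assumption, not what the PR implements; `truncator`, `request`, and
`advance` are hypothetical names.

```go
package main

import "fmt"

// truncator is a hypothetical helper that tracks a log truncation which
// cannot be fully carried out yet because part of the range is not durable.
type truncator struct {
	firstIndex uint64 // first index still present in the local log
	pending    uint64 // requested truncation index still outstanding (0 = none)
}

// request records a truncation of all entries up to and including upTo,
// then removes whatever part of that range is already durable locally.
func (t *truncator) request(upTo, localDurable uint64) {
	if upTo > t.pending {
		t.pending = upTo
	}
	t.advance(localDurable)
}

// advance is called whenever more of the log becomes durable locally and
// carries out as much of the pending truncation as is now possible.
func (t *truncator) advance(localDurable uint64) {
	if t.pending == 0 {
		return
	}
	target := t.pending
	if localDurable < target {
		target = localDurable
	}
	if target+1 > t.firstIndex {
		t.firstIndex = target + 1
	}
	if localDurable >= t.pending {
		t.pending = 0 // truncation fully carried out
	}
}

func main() {
	t := truncator{firstIndex: 1}
	// Applying an entry asks to truncate through index 8, but only
	// indexes up to 5 are durable locally.
	t.request(8, 5)
	fmt.Println(t.firstIndex, t.pending)
	// Later the local write catches up to index 9.
	t.advance(9)
	fmt.Println(t.firstIndex, t.pending)
}
```

The design choice here is to defer rather than reject: the truncation is
remembered and completed as the local log catches up.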

I haven't thought much about other problems that may occur, since at
present I'm sending this mostly for prototyping and discussion, not
with an intention to merge.

tbg added 2 commits March 17, 2023 17:51