Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use piecewise Iteration for a full table scan. #35

Open
prataprc opened this issue Jul 26, 2017 · 0 comments
Open

Use piecewise Iteration for a full table scan. #35

prataprc opened this issue Jul 26, 2017 · 0 comments

Comments

@prataprc
Copy link
Member

prataprc commented Jul 26, 2017

Full table scan on active tree will have following issues:

  • With MVCC disabled, Iterations will lock the entire tree, until it
    completes.
  • With MVCC enabled, Iterations won't lock the writer and won't
    interfer with other concurrent readers. But if there are hundreds
    and thousands of mutations happening in the background, while the
    iterator is holding the snapshot, it can lead to huge memory
    pressure.

To avoid this, implement a PiecewiseIterator that scans the tree
part by part and stitches them together to simulate a full-table scan.
Here are some implementation guidelines.

  • Can be implemented only when application can maintain a monotonically
    increasing seqno for every mutation (CREATE, UPDATE, DELETE).
  • LLRB should be created with metadata.bornseqno, metadata.deadseqno
    enabled.
  • PiecewiseIterators will repeat a large number of small Iteration on
    the tree until it completes a full table scan.
  • Application should supply the tillseqno as a form of timestamp to
    filter mutations that happens after the point in time that started
    PiecewiseIteration.
  • The first iteration will have startkey as nil and endkey as nil.
  • Each iteration will read only 1000 entries. And the last key will be
    remembered as the startkey for the next iteration with inclusion
    set to high.
  • For every iteration, the entries read from the tree will be filtered
    and returned to the application.
    • Get the node's timestamp by picking the larger value between
      bornseqno and deadseqno.
    • Node's timestamp should be less than or equal to tillseqno. Else
      skip.

The key difference between Iterator and PiecewiseIterator is that
PiecewiseIterator does not give a point time view of the tree snapshot.
For example, if an entry that is not yet read by the piecewise-iterator
is updated after the iteration has started, then the entry might be skipped
and this entry won't be part of the final output.

This implies:

  • The final output won't contain the full sample set of entries.
  • Cannot be queried with correctness.

Hopefully it can still be used with LSM reads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant