Sequence numbers commit data for Lucene uses Iterable interface #20793

abeyad · 2016-10-07T03:42:40Z

Sequence number related data (maximum sequence number, local checkpoint,
and global checkpoint) gets stored in Lucene on each commit. The logical
place to store this data is on each Lucene commit's user commit data
structure (see IndexWriter#setCommitData and the new version
IndexWriter#setLiveCommitData). However, previously we did not store the
maximum sequence number in the commit data because the commit data got
copied over before the Lucene IndexWriter flushed the documents to segments
in the commit. This meant that between the time that the commit data was
set on the IndexWriter and the time that the IndexWriter completes the commit,
documents with higher sequence numbers could have entered the commit.
Hence, we would use FieldStats on the _seq_no field in the documents to get
the maximum sequence number value, but this suffers the drawback that if the
last sequence number in the commit corresponded to a delete document action,
that sequence number would not show up in FieldStats as there would be no
corresponding document in Lucene.

In Lucene 6.2, commit data was changed to take an Iterable interface, so
that the commit data can be calculated and retrieved after all documents
have been flushed. This commit changes max_seq_no so it is stored in the
commit data instead of being calculated from FieldStats, taking advantage of
the deferred calculation of the max_seq_no through passing an Iterable that
dynamically sets the iterator data.

Relates #10708

Sequence number related data (maximum sequence number, local checkpoint, and global checkpoint) gets stored in Lucene on each commit. The logical place to store this data is on each Lucene commit's user commit data structure (see IndexWriter#setCommitData and the new version IndexWriter#setLiveCommitData). However, previously we did not store the maximum sequence number in the commit data because the commit data got copied over before the Lucene IndexWriter flushed the documents to segments in the commit. This means that between the time that the commit data was set on the IndexWriter and the time that the IndexWriter completes the commit, documents with higher sequence numbers could have entered the commit. Hence, we would use FieldStats on the _seq_no field in the documents to get the maximum sequence number value, but this suffers the drawback that if the last sequence number in the commit corresponded to a delete document action, that sequence number would not show up in FieldStats as there would be no corresponding document in Lucene. In Lucene 6.2, the commit data was changed to take an Iterable interface, so that the commit data can be calculated and retrieved *after* all documents have been flushed, while the commit data itself is being set on the Lucene commit. This commit changes max_seq_no so it is stored in the commit data instead of being calculated from FieldStats, taking advantage of the deferred calculation of the max_seq_no through passing an Iterable that dynamically sets the iterator data.

bleskes

Thx @abeyad . I left some minor comments. Can we increase the test to cover delete operations?

bleskes · 2016-10-07T06:50:54Z

core/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

+                 * all documents, we defer computation of the max_seq_no to the time of invocation of the commit
+                 * data iterator (which occurs after all documents have been flushed to Lucene).
+                 */
+                final Map<String, String> deferredCommitData = new HashMap<>(commitData.size() + 1);


I think it will be simpler to just capture the local and global checkpoints before we start and build one map here. This way we don't need two maps.

bleskes · 2016-10-07T06:51:11Z

core/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

@@ -1336,10 +1321,26 @@ private void commitIndexWriter(IndexWriter writer, Translog translog, String syn
            }

            if (logger.isTraceEnabled()) {
-                logger.trace("committing writer with commit data [{}]", commitData);
+                logger.trace("committing writer with commit data (max_seq_no excluded) [{}]", commitData);


can we log this after the commit so we have everything?

will do, in doing this, I noticed there is an issue in how the commit data is set - anytime the iterator() is called, it will recompute the maxSeqNo based on the current value of what the SequenceNumbersService returns. We only want this done once so each subsequent time we call writer.getLiveCommitData(), we get the same value that went into the commit. I'm fixing this and will push up a new commit

…ees)

intertwined with document indexing.

abeyad · 2016-10-07T18:58:21Z

@bleskes I pushed 1d63334 to address your review comments on how to construct the commit data, and to ensure its safe access on subsequent calls to its iterator. I pushed d396c7e to add document deletion to the tests.

bleskes

I left a minor suggestion. Also, I think it will be great to have a test that concurrently indexes and commits and makes sure that with every commit point, all ops below the local checkpoint are present and no ops about the max_seq is present. Feel free to add this test here or in a follow up change.

bleskes · 2016-10-09T19:18:45Z

core/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

+            writer.setLiveCommitData(new Iterable<Map.Entry<String, String>>() {
+                // save the max seq no the first time its computed, so subsequent iterations don't recompute,
+                // potentially getting a different value
+                private String computedMaxSeqNoEntry = null;


it seems like we need this cashing because we load user data from the index writer. maybe we can use lastCommittedSegmentInfos, which can be read earlier when opening the engine?

@bleskes I agree with using lastCommittedSegmentInfos from the engine as a solution, but my concern here was that basically we have to document and ensure that no one uses IndexWriter#getLiveCommitData or depends on it for accurate information, otherwise the max_seq_no could be different than what is actually stored in the commit. That's why I added the caching part, which does increase complexity. Do you prefer I remove it and just document that we should never call IndexWriter#getLiveCommitData inside ES code?

I think it's OK to not use IndexWriter#getLiveCommitData - as it may not return what's in the last commit. I suspect this is why it was renamed to say live in the name. Is there anything specific you are concerned about?

to Lucene, ensuring the sequence number related commit data in each Lucene commit point matches the invariants of localCheckpoint <= highest sequence number in commit <= maxSeqNo

abeyad · 2016-10-12T14:05:56Z

@bleskes I pushed 2656776 to remove caching from the iterator and 27532b1 adds a concurrent writes/commit test (and fixes another test)

bleskes

Thx @abeyad . This looks great. I left some minor comments around testing.

bleskes · 2016-10-12T14:17:27Z

core/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

+                }
+                commitData.put(MAX_SEQ_NO, Long.toString(seqNoService().getMaxSeqNo()));
+                if (logger.isTraceEnabled()) {
+                    logger.trace("committed writer with commit data [{}]", commitData);


nit: should be "commit_ting_ writer with commit data"

bleskes · 2016-10-12T14:29:55Z