Handle interrupt gracefully for Lucene mutable index #11558
Conversation
Codecov Report
```diff
@@             Coverage Diff              @@
##             master    #11558       +/-  ##
=============================================
- Coverage     63.07%    14.49%    -48.58%
+ Complexity     1108       201       -907
=============================================
  Files          2325      2327         +2
  Lines        124772    125001       +229
  Branches      19052     19079        +27
=============================================
- Hits          78694     18124     -60570
- Misses        40480    105343     +64863
+ Partials       5598      1534      -4064
```
... and 1505 files with indirect coverage changes
Force-pushed from `7fe7984` to `ec97d37`.
```java
/**
 * This class manages a thread pool used for searching over realtime Lucene segments by {@link RealtimeLuceneTextIndex}.
 * The pool max size is equivalent to pinot.query.scheduler.query_worker_threads to ensure each worker thread can have
 * an accompanying Lucene searcher thread if needed. init() is called in BaseServerStarter to avoid creating a
```
Why not just delegate to the worker thread pool itself?
I doubt this approach actually ensures much or provides a benefit: given context switching (and the fact that the number of worker threads is 2x the number of cores in the current code), having a searcher thread for each corresponding worker thread may not deliver the benefit we are looking for.
This might be due to my unfamiliarity with the code, but I didn't find the worker thread pool to be accessible from RealtimeLuceneTextIndex. Could you suggest how to delegate to it?
If we use the worker thread pool, then we would have two types of threads in the pool: one that waits on the future and another that actually runs it (correct me if I am wrong).
That could lead to deadlock-like scenarios when we have many concurrent queries. We ran into similar issues with the multistage engine in the past.
I don't think this standard coding pattern of an outer/parent thread waiting on future completion while a child thread actually does the work should lead to deadlock, unless we have done something in the code that makes them compete for the same resource.
In any case, this isn't a blocker. My main concern was primarily that we may be starting too many threads, which could have general system performance impacts. If the deadlock concern is real, then maybe this is fine.
Within LinkedIn, we don't currently use text search in realtime, so I can't speak to it for now.
Deadlock can very easily happen if numConcurrentQueries = workerThreadPoolSize/2. In that case all threads in the thread pool will be utilized, and new queries would get submitted to the same pool, landing at the head of the queue. These new queries will then block on their futures as soon as some threads finish, but the futures can't complete until some other query finishes. So for each such query we'll have one thread needlessly waiting on resources to be freed, even though we have capacity to run them.
For the multistage engine this was a critical fix (I think we saw a ~5-10% reliability improvement; my memory might not be correct on the exact number). With shared threads we were hitting this issue once every few hours.
Edit: It's not technically a deadlock; I use the term loosely, but I think you get the idea. We would end up underutilizing CPU cores, and query latency would increase or time out.
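The starvation scenario described above can be reproduced in a few lines. This is a minimal sketch, not Pinot code: a single-thread pool stands in for a fully utilized worker pool, and the class/method names are hypothetical. The outer task submits a child task to the same pool and waits on its future, so the child can never be scheduled:

```java
import java.util.concurrent.*;

public class SharedPoolStarvation {
    // Returns true if the outer task starves while waiting on an inner task
    // submitted to the same (fully occupied) pool.
    static boolean demonstratesStarvation() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1); // stands in for a saturated worker pool
        try {
            Future<String> outer = pool.submit(() -> {
                // The inner task can never start: the pool's only thread is blocked right here.
                Future<String> inner = pool.submit(() -> "done");
                return inner.get(500, TimeUnit.MILLISECONDS);
            });
            outer.get();
            return false; // would mean the inner task somehow ran
        } catch (ExecutionException e) {
            // inner.get() timed out inside the outer task: starvation, not progress
            return e.getCause() instanceof TimeoutException;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("starved: " + demonstratesStarvation()); // prints "starved: true"
    }
}
```

A dedicated searcher pool avoids this by guaranteeing that the waiting thread and the running thread never come from the same bounded pool.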
```java
try {
  return searchFuture.get();
} catch (InterruptedException e) {
  docIDCollector.markShouldCancel();
```
Just for clarification: does doing this prevent the index corruption mentioned in the PR description? This is still in the context of the searcher thread that got interrupted, right?
The InterruptedException being caught here comes from the parent thread getting interrupted (not the searcher thread) while waiting for the searchFuture to complete.
```java
private static ExecutorService _executorService;

private RealtimeLuceneTextIndexSearcherPool(int size) {
  _executorService = new ThreadPoolExecutor(0, size, 500, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
```
Just to confirm: the ThreadPoolExecutor is not going to create any threads (and keep them alive) before search tasks are submitted?
If it did, there could be a perf impact from creating 4x numCores threads (2x coming from the worker pool and 2x coming from this new pool).
Yes, exactly. Threads are created as needed (up to numWorkerThreads) and kept alive for 500 ms once idle, to avoid re-instantiation overhead for high-QPS use cases.
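The on-demand behavior of a `ThreadPoolExecutor` with `corePoolSize = 0` can be observed directly. This sketch uses a `SynchronousQueue` purely as an illustration choice (the PR's constructor uses a `LinkedBlockingQueue`), because it makes pool growth easy to see with `getPoolSize()`:

```java
import java.util.concurrent.*;

public class LazyPoolDemo {
    public static void main(String[] args) throws Exception {
        // corePoolSize = 0: no threads exist until a task is submitted.
        ThreadPoolExecutor pool =
            new ThreadPoolExecutor(0, 4, 500, TimeUnit.MILLISECONDS, new SynchronousQueue<>());

        System.out.println("before any task:  " + pool.getPoolSize()); // 0 threads

        pool.submit(() -> { }).get();                                  // first task creates a thread
        System.out.println("after one task:   " + pool.getPoolSize()); // 1 thread

        Thread.sleep(1500); // longer than the 500 ms keep-alive
        System.out.println("after keep-alive: " + pool.getPoolSize()); // idle thread reclaimed -> 0

        pool.shutdown();
    }
}
```

With `corePoolSize = 0`, every thread is subject to the keep-alive timeout, so an idle pool eventually holds zero threads, matching the "created as needed" behavior described above.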
Overall LGTM. Let's just ensure that the new pool won't spin up unnecessary threads.
@siddharthteotia re: spinning up threads, see this comment. Let me know if there are any other concerns, or could you help merge this?
This addresses #11402.
When a thread is interrupted while searching the consuming segment, the underlying FSDirectory used by the IndexWriter that the SearcherManager was created with can be corrupted. To ensure the index is never corrupted, this PR moves the search into a child thread, and the interrupt is handled in the current thread by cancelling the search gracefully.
This was tested on an internal cluster: query timeouts no longer cause massive numbers of 'failed to index' logs, and the exceptions reported in the aforementioned issue are no longer present.
Some additional background on this behavior is here: apache/lucene#9309
tags:
bugfix