MSQ controller: Support in-memory shuffles; towards JVM reuse. #16168
Merged
Commits on Mar 19, 2024
-
MSQ controller: Support in-memory shuffles; towards JVM reuse.
This patch contains two controller changes that make progress towards a lower-latency MSQ.

First, support for in-memory shuffles. The main feature of in-memory shuffles, as far as the controller is concerned, is that they are not fully buffered. That means that whenever a producer stage uses in-memory output, its consumer must run concurrently. The controller determines which stages run concurrently, and when they start and stop.

"Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles even if we can only run two stages at once. For example, in a linear chain of stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling for each one while only running two at once. (When stage 1 is done reading input and about to start writing its output, we can stop 0 and start 2.)

1) New OutputChannelMode enum attached to WorkOrders that tells workers whether stage output should be in memory (MEMORY), or use local or durable storage.
2) New logic in the ControllerQueryKernel to determine which stages can use in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them at the appropriate time (ControllerQueryKernel#createNewKernels).
3) New "doneReadingInput" method on Controller (passed down to the stage kernels) which allows stages to transition to POST_READING even if they are not gathering statistics. This is important because it enables "leapfrogging" for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition.
4) Moved result-reading from ControllerContext#writeReports to new QueryListener interface, which ControllerImpl feeds results to row-by-row while the query is still running. Important so we can read query results from the final stage using an in-memory channel.
5) New class ControllerQueryKernelConfig holds configs that control kernel behavior (such as whether to pipeline, maximum number of concurrent stages, etc). Generated by the ControllerContext.

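
The leapfrogging schedule described above can be illustrated with a small standalone sketch. It is not the ControllerQueryKernel logic from this patch: the class LeapfrogSketch and the method runningPairs are hypothetical names, and the sketch assumes a strictly linear chain in which every stage does a sort-based shuffle and at most two stages run at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the patch.
final class LeapfrogSketch {

    /**
     * For a linear chain of stages 0 -> 1 -> ... -> numStages-1, returns the
     * successive sets of stages that run together when only two may run at once.
     * Each step is [producer, consumer]; the producer writes in-memory output
     * that the consumer reads concurrently.
     */
    static List<List<Integer>> runningPairs(int numStages) {
        List<List<Integer>> steps = new ArrayList<>();
        if (numStages <= 0) {
            return steps;
        }
        if (numStages == 1) {
            // A single stage has no consumer to pair with.
            steps.add(Arrays.asList(0));
            return steps;
        }

        int producer = 0;
        int consumer = 1;
        steps.add(Arrays.asList(producer, consumer));

        // Once the consumer is done reading input, the old producer can stop
        // and the next stage can start: the consumer becomes the new producer.
        while (consumer < numStages - 1) {
            producer = consumer;
            consumer = consumer + 1;
            steps.add(Arrays.asList(producer, consumer));
        }
        return steps;
    }

    public static void main(String[] args) {
        // The 0 -> 1 -> 2 chain from the example above.
        System.out.println(runningPairs(3)); // prints [[0, 1], [1, 2]]
    }
}
```

In this simplified model the switch from [0, 1] to [1, 2] corresponds to the moment stage 1 is done reading input, which is the event the new "doneReadingInput" method is described as reporting.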
Second, a refactor towards running workers in persistent JVMs that are able to cache data across queries. This is helpful because I believe we'll want to reuse JVMs and cached data for latency reasons.

1) Move creation of WorkerManager and TableInputSpecSlicer to the ControllerContext, rather than ControllerImpl. This allows managing workers and work assignment differently when JVMs are reusable.
2) Lift the Controller Jersey resource out from ControllerChatHandler to a reusable resource.
3) Move memory introspection to a MemoryIntrospector interface, and introduce ControllerMemoryParameters that uses it. This makes it easier to run MSQ in process types other than Indexer and Peon.

Both of these areas will have follow-ups that make similar changes on the worker side.
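
To make item 3 concrete, here is a rough sketch of why putting memory introspection behind an interface helps when running in other process types. Everything in it is an assumption for illustration: the nested Introspector interface, its totalMemoryBytes method, and the ControllerMemorySketch class are hypothetical and do not reproduce the actual MemoryIntrospector or ControllerMemoryParameters signatures from the patch.

```java
// Hypothetical illustration only; names and signatures are not Druid's.
final class MemoryIntrospectorSketch {

    /** Stand-in for a memory introspection interface. */
    interface Introspector {
        long totalMemoryBytes();
    }

    /** Reports the maximum heap of the JVM this code runs in. */
    static final class JvmIntrospector implements Introspector {
        @Override
        public long totalMemoryBytes() {
            return Runtime.getRuntime().maxMemory();
        }
    }

    /** Reports a fixed value, e.g. for tests or a differently sized process. */
    static final class FixedIntrospector implements Introspector {
        private final long bytes;

        FixedIntrospector(long bytes) {
            this.bytes = bytes;
        }

        @Override
        public long totalMemoryBytes() {
            return bytes;
        }
    }

    /** Stand-in for controller memory parameters derived from an introspector. */
    static final class ControllerMemorySketch {
        final long usableBytes;

        private ControllerMemorySketch(long usableBytes) {
            this.usableBytes = usableBytes;
        }

        static ControllerMemorySketch from(Introspector introspector, double usableFraction) {
            if (usableFraction <= 0.0 || usableFraction > 1.0) {
                throw new IllegalArgumentException("usableFraction must be in (0, 1]");
            }
            long usable = (long) (introspector.totalMemoryBytes() * usableFraction);
            return new ControllerMemorySketch(usable);
        }
    }

    public static void main(String[] args) {
        // The same derivation works whichever introspector is supplied.
        System.out.println(ControllerMemorySketch.from(new FixedIntrospector(1024), 0.25).usableBytes);
        System.out.println(ControllerMemorySketch.from(new JvmIntrospector(), 0.25).usableBytes);
    }
}
```

The point of the design is that the parameter computation never asks the runtime directly, so a host process can supply whatever view of memory is appropriate for it.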
Commit 6b7d766
- 75376ce
- 1d80f02
- 08b4671
- c30b1cc
- 7a2693f
- 292b143
Commits on Apr 4, 2024
- 456436a
Commits on Apr 9, 2024
- 57ba025
- fc6d846
- 7e99c13
- 23b0cdf
Commits on Apr 15, 2024
- f26931e
- 279ab58
Commits on Apr 16, 2024
- 6cce08a
- 5da7200
Commits on Apr 23, 2024
- 1d43eb2
- 120a917
Commits on Apr 26, 2024
- 87d6ff5
Commits on Apr 29, 2024
- 62be98e
- 05ec634