-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Union in Decoupled planning #17354
base: master
Are you sure you want to change the base?
Conversation
processing/src/test/java/org/apache/druid/query/union/UnionQueryQueryToolChestTest.java
Fixed
Show fixed
Hide fixed
processing/src/test/java/org/apache/druid/query/union/UnionQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
.thenReturn((q, ctx) -> (Sequence) scan2.makeResultSequence()); | ||
|
||
QueryRunner<UnionResult> unionRunner = toolChest.makeQueryRunner(query, walker); | ||
Sequence<UnionResult> results = unionRunner.run(QueryPlus.wrap(query), null); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note test
CalciteTestBase.makeColumnExpression
public void test1() | ||
{ | ||
|
||
UnionQueryRunner uqr = new UnionQueryRunner(null, null); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note test
CalciteTestBase.makeColumnExpression
@Override | ||
public DateTimeZone getTimezone() | ||
{ | ||
throw DruidException.defensive("Method supported. [%s]", DruidException.getCurrentMethodName()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the method name in the message? It's part of the exception...
Also, your message is perhaps missing a not
? Are these intended to be "not yet implemented" or "not implemented and should never be implemented"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have not started with this - the exception traces are not always shown; which makes the message less meaningfull. Putting more detail into the exception might help identify the actual issue.
but that shouldn't be addressed individually like I was doing here....it would be better to improve on these things separately.
replaced all these exception creations with a DruidException#methodNotSupported
processing/src/main/java/org/apache/druid/query/DefaultQueryRunnerFactoryConglomerate.java
Outdated
Show resolved
Hide resolved
@Override | ||
@SuppressWarnings("unchecked") | ||
public <T, QueryType extends Query<T>> QueryExecutor<T> getQueryExecutor(QueryType query) | ||
{ | ||
QueryToolChest<T, QueryType> toolchest = getToolChest(query); | ||
if (toolchest instanceof QueryExecutor) { | ||
return (QueryExecutor<T>) toolchest; | ||
} | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming issue: "QueryExecutor" is extremely generic, the name should more reflect where in the lifecycle the query is being executed, so that it's more clear what this is in charge of. We should also have javadoc describing what this is responsible for. I'd suggest something like, QueryEntryPointRunner
or something similar to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we name the interface QueryLogic
and have a method called entryPoint
. This would actually align with something else I'm doing as well, so I kinda like that. Especially as the distinction between the ToolChest and the QueryRunnerFactory has basically disappeared as the system has evolved.
default Query<T> withDataSources(List<DataSource> children) | ||
{ | ||
if (children.size() != 1) { | ||
throw new IAE("Must have exactly one child"); | ||
} | ||
return withDataSource(Iterables.getOnlyElement(children)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods for getDataSources
and withDataSources
I would hope that we can eliminate on the interface. They are a leaky abstraction, I think that any code that would actually need these would be able to be avoided by implementing the QueryExecutor
thingie which means that these methods don't need to exist on the interface anymore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were needed for 2 reasons:
- before this PR the
DataSource
class alredy had [get|with]Children methods ; since this changes the theQuery
to have multiple of them - it feeled like communicating it clearly is better - so that all existing places it - I had to also use this in
recursivelyClearContext
generateSubqueryIds
andinsertSubqueryIds
use these to set context variables
alternative options could be:
- there is possibly another way to not have these methods: provide a visitor interface which could "visit" all queries / datasources optionally rewriting them
- it might worth a try to service these differently from
QueryDataSource
(so that it does not appear on the mainQuery
interface)
I've done the second approach from above; I'm not sure about the complexity that will cause in the long run but its not that bad right now...I had to add 3 instanceof-s:
- 2 in
QueryDataSource
- 1 in a test related
clearRecursivelyContext
method
/** | ||
* Executes the query by utilizing the given walker. | ||
*/ | ||
public interface QueryExecutor<T> | ||
{ | ||
QueryRunner<T> makeQueryRunner( | ||
Query<T> query, | ||
QuerySegmentWalker walker | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this comment elsewhere as well, but what do you think about calling this QueryLogic
and then the method you are creating is initialEntryPoint()
?
processing/src/main/java/org/apache/druid/query/union/UnionQueryRunner.java
Outdated
Show resolved
Hide resolved
processing/src/main/java/org/apache/druid/query/union/UnionResult.java
Outdated
Show resolved
Hide resolved
|
||
QueryExecutor<Object> subQueryExecutor = conglomerate.getQueryExecutor(subQuery); | ||
final QueryRunner subQueryRunner; | ||
if (subQueryExecutor != null) { | ||
subQueryRunner = subQueryExecutor.makeQueryRunner(subQueryWithSerialization, this); | ||
} else { | ||
subQueryRunner = subQueryWithSerialization.getRunner(this); | ||
} | ||
|
||
queryResults = subQueryRunner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like the wrong place to be checking for the Executor.
Why can't we do it up around line 386?
server/src/test/java/org/apache/druid/query/QueryRunnerBasedOnClusteredClientTestBase.java
Outdated
Show resolved
Hide resolved
throw DruidException.defensive("XXXOnly Table and Values are supported as inputs for Union [%s]", sources); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure if this is truly a "defensive" exception or if it is indicative of bad user input? Is there a way that a user can do something to cause this exception to get thrown?
If we don't expect it to be seen, then a message like "Got an input type [%s] that is not supported. This should not happen".
.thenReturn((q, ctx) -> (Sequence) scan2.makeResultSequence()); | ||
|
||
QueryRunner<Object> unionRunner = toolChest.entryPoint(query, walker); | ||
Sequence<Object> results = unionRunner.run(QueryPlus.wrap(query), null); |
Check notice
Code scanning / CodeQL
Unread local variable Note test
.thenReturn((q, ctx) -> (Sequence) scan2.makeResultSequence()); | ||
|
||
QueryRunner<Object> unionRunner = toolChest.entryPoint(query, walker); | ||
Sequence<Object> results = unionRunner.run(QueryPlus.wrap(query), null); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note test
UnionQuery
Query
UnionQuery
execution is driven by theQueryExecutor
- which could later enable to reduce some complexity inClientQuerySegmentWalker
UnionQueryToolChest
methods properly there was a need to get access to otherToolChest
-s ; most of the related test system refactor is done in Enhance injector usage during conglomerate builds in tests #17331UnionQueryRunner
toUnionDataSourceQueryRunner
QueryRunnerFactoryConglomerate
could act as aQueryToolChestWarehouse
which shaves of some unnecessary things here and there