Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw an exception when MV columns are present in the order-by expression list in selection order-by only queries #9078

Merged

Conversation

somandal
Copy link
Contributor

@somandal somandal commented Jul 21, 2022

Today MV columns as the first expression in a Selection Order By query fails during query execution. The error seen today is:

18:25:22.961 BaseCombineOperator - Caught exception while processing query: QueryContext{_tableName='testTable', _subquery=null, _selectExpressions=[mvFloatCol], _aliasList=[null], _filter=mvFloatCol < '5', _groupByExpressions=null, _havingFilter=null, _orderByExpressions=[mvFloatCol ASC], _limit=10, _offset=0, _queryOptions={}, _expressionOverrideHints={}, _explain=false}
java.lang.ClassCastException: class [F cannot be cast to class java.lang.Comparable ([F and java.lang.Comparable are in module java.base of loader 'bootstrap')
	at org.apache.pinot.core.operator.combine.MinMaxValueBasedSelectionOrderByCombineOperator.processSegments(MinMaxValueBasedSelectionOrderByCombineOperator.java:210) ~[classes/:?]
	at org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:101) ~[classes/:?]
	at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40) ~[classes/:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]

This PR throws an UnsupportedOperationException during the server side planning phase when it identifies a MV column as the first expression in the selection order-by query to provide users with a better message about why the query fails.

Note that this PR only errors out the scenario when the MV column is the first expression in the order-by expression list rather than erroring out whenever any MV column is present in the order-by expression list. The reason for this is to maintain backward compatibility. Today if a mix of SV and MV columns are present in a selection order-by query, and if the first expression is a SV column then the query execution completes successfully. The position of the MV column has no affect on the actual ordering of results as the SelectionOrderByOperator only includes the SV columns in the comparator used to set the values in the PriorityQueue maintained by it. The query execution failure only occurs when the MV column is the first expression.

Selection order-by on MV columns don't make sense semantically which is why it is better to error out in a meaningful way as the error thrown on the query execution failure requires debugging to understand why it failed.

cc @siddharthteotia

@codecov-commenter
Copy link

codecov-commenter commented Jul 21, 2022

Codecov Report

Merging #9078 (a458592) into master (d778df2) will decrease coverage by 43.81%.
The diff coverage is 7.40%.

❗ Current head a458592 differs from pull request most recent head 79140cb. Consider uploading reports for the commit 79140cb to get more accurate results

@@              Coverage Diff              @@
##             master    #9078       +/-   ##
=============================================
- Coverage     69.90%   26.09%   -43.82%     
+ Complexity     4685       44     -4641     
=============================================
  Files          1862     1854        -8     
  Lines         99335    99143      -192     
  Branches      15115    15100       -15     
=============================================
- Hits          69444    25869    -43575     
- Misses        24967    70690    +45723     
+ Partials       4924     2584     -2340     
Flag Coverage Δ
integration1 26.09% <7.40%> (-0.14%) ⬇️
integration2 ?
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...a/org/apache/pinot/common/metrics/ServerMeter.java 100.00% <ø> (ø)
...core/operator/blocks/IntermediateResultsBlock.java 60.45% <ø> (-26.01%) ⬇️
...gregation/function/AggregationFunctionFactory.java 11.85% <0.00%> (-69.36%) ⬇️
...gation/function/CovarianceAggregationFunction.java 0.00% <0.00%> (ø)
...ot/segment/local/customobject/CovarianceTuple.java 0.00% <0.00%> (ø)
...che/pinot/segment/spi/AggregationFunctionType.java 0.00% <0.00%> (-89.54%) ⬇️
.../helix/FreshnessBasedConsumptionStatusChecker.java 0.00% <0.00%> (ø)
.../helix/IngestionBasedConsumptionStatusChecker.java 0.00% <0.00%> (ø)
...ter/helix/OffsetBasedConsumptionStatusChecker.java 0.00% <0.00%> (ø)
...va/org/apache/pinot/spi/utils/CommonConstants.java 0.00% <ø> (-27.70%) ⬇️
... and 1370 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Jackie-Jiang
Copy link
Contributor

Good catch!
I'd suggest doing it slightly different which can handle more general cases:

  1. In MinMaxValueBasedSelectionOrderByCombineOperator.MinMaxValueContext, check if dataSourceMetadata.isSingleValue() and put null if it is not single valued so that we don't get exception
  2. In SelectionOrderByOperator.getComparator(), we have access to the actual metadata of the transformed result, and we may perform the check if we don't want MV to be ordered (currently it skips MV expressions, but IMO it is okay to throw exception to prevent unexpected behavior)

@walterddr
Copy link
Contributor

Good catch! I'd suggest doing it slightly different which can handle more general cases:

  1. In MinMaxValueBasedSelectionOrderByCombineOperator.MinMaxValueContext, check if dataSourceMetadata.isSingleValue() and put null if it is not single valued so that we don't get exception
  2. In SelectionOrderByOperator.getComparator(), we have access to the actual metadata of the transformed result, and we may perform the check if we don't want MV to be ordered (currently it skips MV expressions, but IMO it is okay to throw exception to prevent unexpected behavior)

+1 on both suggestions.

from this claim:

The position of the MV column has no affect on the actual ordering of results as the SelectionOrderByOperator only includes the SV columns in the comparator used to set the values in the PriorityQueue maintained by it. The query execution failure only occurs when the MV column is the first expression.

I think we should definitely not allow MV values present in the order-by list with explicit error, and we can do that in the parser level already right?

@somandal
Copy link
Contributor Author

somandal commented Jul 21, 2022

Good catch! I'd suggest doing it slightly different which can handle more general cases:

  1. In MinMaxValueBasedSelectionOrderByCombineOperator.MinMaxValueContext, check if dataSourceMetadata.isSingleValue() and put null if it is not single valued so that we don't get exception
  2. In SelectionOrderByOperator.getComparator(), we have access to the actual metadata of the transformed result, and we may perform the check if we don't want MV to be ordered (currently it skips MV expressions, but IMO it is okay to throw exception to prevent unexpected behavior)

Thanks for the suggestions @Jackie-Jiang
I didn't understand what you meant by (1) above. What should I set to null?

For (2), we were thinking about throwing there, but there are a few problems.

  1. We want to keep the explicit exception throwing backward compatible. Only if the first field is an identifier does the code path go through the MinMaxValueBasedSelectionOrderByCombineOperator and already fail during query execution. Transform type expressions as the first order by expression won't go through MinMaxValueBasedSelectionOrderByCombineOperator and if we now throw if the column is not a singleValue column then if anyone has existing queries with transform they may fail.
  2. Did you want us to catch multi-value identifiers across all order-by expressions? for the same reason as (1) we wanted to avoid that because people might be running queries today where their first order by expression is single value identifier and with this change their query might fail.
  3. We also wanted to catch this issue before starting the query execution. That's why we thought planning phase might be a good idea.

Let me know your thoughts on the above

@somandal
Copy link
Contributor Author

somandal commented Jul 21, 2022

+1 on both suggestions.

from this claim:

The position of the MV column has no affect on the actual ordering of results as the SelectionOrderByOperator only includes the SV columns in the comparator used to set the values in the PriorityQueue maintained by it. The query execution failure only occurs when the MV column is the first expression.

I think we should definitely not allow MV values present in the order-by list with explicit error, and we can do that in the parser level already right?

I did consider that, but we may need to add more checks on the broker level if we wanted to do it there. The server side figured out which queries are aggregation types already and creates a different PlanNode for that. We'd have to add some of that logic on broker side if we wanted to do this at the parser level. Also, this is not a problem if GROUP BY is included in the query as GROUP BY flattens out the column. Let me know your thoughts.

QueryContextUtils.isAggregationQuery(queryContext) in makeSegmentPlanNode() is pre-populated when the QueryContext is created, so I thought it was easier to reuse this.

@@ -250,13 +250,33 @@ public PlanNode makeSegmentPlanNode(IndexSegment indexSegment, QueryContext quer
return new AggregationPlanNode(indexSegment, queryContext);
}
} else if (QueryContextUtils.isSelectionQuery(queryContext)) {
validateSelectionOrderByExpressions(indexSegment, queryContext);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the earliest stage that this check can be done? We should try to do such semantic checks as soon as possible (during or right after parsing) so that the query can be failed as quickly as possible without consuming any extra resources. There are a bunch of checks in BrokerRequestHandler, so I am wondering if this check can be done there instead of in server side plan generation?

@Jackie-Jiang
Copy link
Contributor

@somandal The idea here is to throw exception when user tries to order-by any expression that is MV. The earliest time when we know the expression is SV or MV is in the constructor of SelectionOrderByOperator, so we can catch it before the query execution. Actually we shouldn't need to change the MinMaxValueBasedSelectionOrderByCombineOperator because it is constructed in the execution phase, which is after the creation of SelectionOrderByOperator.

@siddharthteotia
Copy link
Contributor

siddharthteotia commented Jul 21, 2022

So there are broadly 2 categories of MV specific order by expressions for selection order by queries

  • SELECT ..... ORDER BY <first expression is MV identifier expression> , 0 or more expressions could be SV or MV
  • SELECT ..... ORDER BY <first expression is MV transform expression>, 0 or more expressions could be SV or MV

I agree that changing min-max operator is too late and the problem can be solved / detected sooner.

I also agree that purist way of solving this is to detect the presence of any MV expression in the ORDER BY expressions like in SelectionOrderByOperator constructor and throw exception. I think it is a clean solution.

However, currently the engine lets 2nd category of queries go through by ignoring the MV expression in the ORDER BY list. The 1st category fails later during execution with not a meaningful error in the min-max operator when trying to execute first expression.

Making changes to constructor of SelectionOrderByOperator will fail both categories (which is good ) but I have no idea if our production has any queries from 2nd category so there is a potential of breaking prod queries (if any) not just for us but even for OSS customers / users. There is no way for me to find percentage of such queries from our prod logs

To continue running 2nd category queries (like today), we decided not to make the operator change and instead fail during server planning by special casing for category 1 queries.

Any thoughts on the risk to do the clean/pure solution ? If there is such query in production, we may have to rollback or force the client app to change queries. cc @somandal @Jackie-Jiang @walterddr

@Jackie-Jiang
Copy link
Contributor

@siddharthteotia I understand the concern of breaking some prod queries, but IMO it is better to return exception rather than return unexpected result. The failed queries won't serve the purpose anyway, and returning an exception with proper error message is the best way to catch bad queries. Also, I don't think people have such queries, or they will complain when we added the MinMaxValueBasedSelectionOrderByCombineOperator

@siddharthteotia
Copy link
Contributor

@Jackie-Jiang @somandal

I think we are on the same page here w.r.t what is the end state here -- block any MV expressions in ORDER BY via throwing exception in SelectionOrderByOperator constructor.

I am not comfortable with making a change that may fail existing queries. Shall we do this ?

  • Merge this change along with another change to detect presence of queries belonging to 2nd category (in my comment above) via a metric.
  • Follow up in a couple of weeks time after having changed user queries (if any) depending on metric we observe internally and make the change in SelectionOrderByOperator for the general scenario.

I am also ok with not doing this change at all and just do the metric change since when the final solution is implemented, the current change will probably be deleted anyway.

@Jackie-Jiang
Copy link
Contributor

It is a good idea to first emit metrics, then after monitoring for some time, change it to throw exceptions. I'd suggest only changing the SelectionOrderByOperator because that is the correct place to handle this check.

@somandal
Copy link
Contributor Author

It is a good idea to first emit metrics, then after monitoring for some time, change it to throw exceptions. I'd suggest only changing the SelectionOrderByOperator because that is the correct place to handle this check.

Hey @Jackie-Jiang, I've created a new PR: #9117 to address the metrics change. Once we get that in and monitor the usage on our end for such queries, I'll get back to this fix and make the clean change in the SelectionOrderByOperator directly. Can you take a look at that metrics change when you get a chance? Thanks!

@somandal somandal force-pushed the throw-exception-mv-first-select-orderby branch from 0bb6bc8 to b00ac55 Compare August 25, 2022 19:50
@somandal somandal changed the title Throw an exception when MV columns are the first order-by expression in selection order-by queries Throw an exception when MV columns are present in the order-by expression list in selection order-by only queries Aug 25, 2022
@somandal
Copy link
Contributor Author

It is a good idea to first emit metrics, then after monitoring for some time, change it to throw exceptions. I'd suggest only changing the SelectionOrderByOperator because that is the correct place to handle this check.

Hey @siddharthteotia and @Jackie-Jiang I've cleaned up this code, removed the metric and added the exception as part of the SelectionOrderByOperator. Can you folks take a look?

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments

@Jackie-Jiang Jackie-Jiang merged commit 53c117f into apache:master Aug 28, 2022
@somandal somandal deleted the throw-exception-mv-first-select-orderby branch August 29, 2022 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants