Forbid expensive query parts in ranking evaluation #30151

Merged
merged 4 commits into from
May 14, 2018


cbuescher
Member

Currently the ranking evaluation API accepts the full query syntax for
the queries specified in the evaluation set and executes them via multi
search. This potentially runs costly aggregations and suggestions too.
This change adds checks that forbid using aggregations, suggesters or
highlighters in the queries that are run as part of the ranking
evaluation since they are irrelevant in the context of this API.

Closes #29674
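
The kind of check described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `SketchSearchSource` is a hypothetical stand-in for `SearchSourceBuilder`, and `forbiddenSections` and the error message text are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SearchSourceBuilder; it only records which sections are set.
final class SketchSearchSource {
    boolean hasAggregations;
    boolean hasSuggest;
    boolean hasHighlighter;
}

final class EvaluatedQueryCheck {
    // Returns the names of sections that are set but irrelevant for ranking evaluation.
    static List<String> forbiddenSections(SketchSearchSource source) {
        List<String> found = new ArrayList<>();
        if (source == null) {
            return found; // nothing to validate when no request is set
        }
        if (source.hasAggregations) {
            found.add("aggregations");
        }
        if (source.hasSuggest) {
            found.add("suggest");
        }
        if (source.hasHighlighter) {
            found.add("highlight");
        }
        return found;
    }

    // Rejects the request up front instead of executing the costly sections.
    static void validateEvaluatedQuery(SketchSearchSource source) {
        List<String> found = forbiddenSections(source);
        if (!found.isEmpty()) {
            // Assumed wording; the real error message may differ.
            throw new IllegalArgumentException("Rated request query must not contain: " + found);
        }
    }
}
```

Failing fast at validation time means the expensive sections are never sent to multi search in the first place.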

@cbuescher cbuescher added review :Search Relevance/Ranking Scoring, rescoring, rank evaluation. v7.0.0 v6.4.0 labels Apr 25, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@cbuescher cbuescher force-pushed the rankEval-forbidQueryElements branch from 68ed7f5 to 6084785 Compare April 26, 2018 10:23
@cbuescher
Member Author

@elasticmachine run sample packaging tests

this(id, ratedDocs, testRequest, new HashMap<>(), null);
static void validateEvaluatedQuery(SearchSourceBuilder evaluationRequest) {
// ensure that testRequest, if set, does not contain aggregation, suggest or highlighting section
if (evaluationRequest != null) {
Contributor


I wonder if profile and explain should be forbidden too? Both have non-negligible impact on performance, and seem irrelevant to ranking as well.

Member Author


Good point, I think I will add that before I merge. Thanks.
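
Extending the validation to the two options raised in this review could look something like the sketch below. It is an illustration only: `SketchRequestOptions` is a hypothetical stand-in for the request object, the nullable `explain` field is an assumption about how an unset option is represented, and the error messages are invented for the example.

```java
// Hypothetical stand-in for the two request options discussed in this thread.
final class SketchRequestOptions {
    Boolean explain;  // assumed nullable: null means the option was not set
    boolean profile;
}

final class ExplainProfileCheck {
    // Rejects requests that ask for explain or profile output.
    static void validate(SketchRequestOptions options) {
        if (options == null) {
            return; // nothing to validate
        }
        if (Boolean.TRUE.equals(options.explain)) {
            throw new IllegalArgumentException("Rated request query must not use [explain]");
        }
        if (options.profile) {
            throw new IllegalArgumentException("Rated request query must not use [profile]");
        }
    }
}
```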

@polyfractal
Contributor

Left a minor comment but otherwise LGTM :)

@cbuescher cbuescher merged commit cc93131 into elastic:master May 14, 2018
cbuescher pushed a commit that referenced this pull request May 14, 2018
Currently the ranking evaluation API accepts the full query syntax for
the queries specified in the evaluation set and executes them via multi
search. This potentially runs costly aggregations and suggestions too.
This change adds checks that forbid using aggregations, suggesters,
highlighters and the explain and profile options in the queries that are
run as part of the ranking evaluation since they are irrelevant in the
context of this API.
dnhatn added a commit that referenced this pull request May 14, 2018
* master:
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  [ML] Fix wire BWC for JobUpdate (#30512)
  Use simpler write-once semantics for FS repository (#30435)
  Derive max composite buffers from max content len
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
dnhatn added a commit that referenced this pull request May 14, 2018
* 6.x:
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Moved tokenizers to analysis common module (#30538)
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  [ML] Hide internal Job update options from the REST API (#30537)
  Deprecate not copy settings and explicitly disallow (#30404)
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  Bump Gradle heap to 1792m (#30484)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  Use simpler write-once semantics for FS repository (#30435)
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Security: Simplify security index listeners (#30466)
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  Add proper longitude validation in geo_polygon_query (#30497)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Fix incorrect merged entry in changelog
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
martijnvg added a commit that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (elastic#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (elastic#30151)
  Docs: Update HighLevelRestClient migration docs (elastic#30544)
  Clients: Switch to new performRequest (elastic#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (elastic#30530)
  Deprecate not copy settings and explicitly disallow (elastic#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (elastic#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  ...