Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track total hits up to 10,000 by default #37466

Merged
merged 13 commits into from
Jan 25, 2019

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Jan 15, 2019

This commit changes the default for the track_total_hits option of the search request
to 10,000. This means that by default search requests will accurately track the total hit count
up to 10,000 documents, requests that match more than this value will set the "total.relation"
to "gte" (e.g. greater than or equals) and the "total.value" to 10,000 in the search response.
Scroll queries are not impacted, they will continue to count the total hits accurately.
The default is set back to true (accurate hit count) if rest_total_hits_as_int is set in the search request.
I choose 10,000 as the default because that's also the number we use to limit pagination. This means that
users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate.

Closes #33028

This commit changes the default for the `track_total_hits` option of the search request
to `10,000`. This means that by default search requests will accurately track the total hit count
up to `10,000` documents, requests that match more than this value will set the `"total.relation"`
to `"gte"` (e.g. greater than or equals) and the `"total.value"` to `10,000` in the search response.
Scroll queries are not impacted, they will continue to count the total hits accurately.
The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request.
I choose `10,000` as the default because that's also the number we use to limit pagination. This means that
users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate.

Closes elastic#33028
@jimczi jimczi added >enhancement >breaking :Search/Search Search-related issues that do not fall into other categories labels Jan 15, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@jimczi jimczi requested a review from jpountz January 15, 2019 10:45
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. We will probably need to talk with other teams before merging this change so that they set track_total_hits=true and can deal with disabling exact counts later?

@jimczi
Copy link
Contributor Author

jimczi commented Jan 16, 2019

run the gradle build tests 1

@jimczi
Copy link
Contributor Author

jimczi commented Jan 16, 2019

@elasticmachine run gradle build tests 1

@jimczi
Copy link
Contributor Author

jimczi commented Jan 24, 2019

@elasticmachine run elasticsearch-ci/2

@jimczi jimczi merged commit 787acb1 into elastic:master Jan 25, 2019
@jimczi jimczi deleted the track_total_hits_default branch January 25, 2019 12:45
@jimczi jimczi added the v7.0.0 label Feb 1, 2019
// no matter what the value of track_total_hits is
return SearchContext.TRACK_TOTAL_HITS_ACCURATE;
}
return request.source() == null ? SearchContext.DEFAULT_TRACK_TOTAL_HITS_UP_TO : request.source().trackTotalHitsUpTo() == null ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the request.source() == null? check necessary on line 704? Seems like it's redundant given the check on line 700

p.s: I'm a curious engineer looking to learn more about elasticsearch's internals. I'm reading the diffs in my spare time to learn how experts in the field are building real-world distributed systems. Thanks for taking the time to answer my questions!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check is on request.scroll which is different than request.source so it is necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants