[DOC] New Search Processors (collapse, oversample, truncate_hits) #5151

Closed
1 of 4 tasks
macohen opened this issue Oct 4, 2023 · 0 comments · Fixed by #5881
Labels: 3 - Done, v2.12.0
macohen commented Oct 4, 2023

What do you want to do?

  • Request a change to existing documentation
  • Add new documentation
  • Report a technical problem with the documentation
  • Other

Tell us about your request. Provide a summary of the request and all versions that are affected.
These features are new in 2.11 and should be documented here: https://opensearch.org/docs/latest/search-plugins/search-pipelines/search-processors/

The PR attached to the issue linked below adds the ability to create processors that can share data, plus three new search processors. The three new processor types are:

  1. "collapse" - given a field name, this response processor finds all hits in the returned results that share the same value for that field and collapses them on that value.
  2. "oversample" - given a "sample_factor" (double >= 1.0) and "original_size" (the number of results originally requested at query time), this request processor fetches sample_factor * original_size search hits from the index. This is useful for reranking or other result manipulation where BM25 similarity scores alone may not produce the desired results.
  3. "truncate_hits" - this response processor reduces the number of hits returned in a response to "target_size" (integer >= 0). This is useful in combination with "oversample", to return no more hits than the original request expected.
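
To make the intended interplay of the three processors concrete, here is a minimal Python sketch that simulates their behavior on plain dictionaries. It is not the OpenSearch implementation: the function names, the fallback request size of 10, rounding up with `math.ceil`, keeping the first hit per value in "collapse", and treating a missing field as its own value are all assumptions made for illustration.

```python
import math
from typing import Any


def oversample(request: dict[str, Any], sample_factor: float) -> tuple[dict[str, Any], int]:
    """Return a copy of the request with an enlarged size, plus the original size."""
    if sample_factor < 1.0:
        raise ValueError("sample_factor must be >= 1.0")
    original_size = request.get("size", 10)  # assumed fallback when no size is given
    enlarged = {**request, "size": math.ceil(original_size * sample_factor)}
    return enlarged, original_size


def collapse(hits: list[dict[str, Any]], field: str) -> list[dict[str, Any]]:
    """Keep only the first hit for each distinct value of `field` (assumed policy)."""
    seen = set()
    kept = []
    for hit in hits:
        value = hit["_source"].get(field)  # a missing field is treated as None
        if value in seen:
            continue
        seen.add(value)
        kept.append(hit)
    return kept


def truncate_hits(hits: list[dict[str, Any]], target_size: int) -> list[dict[str, Any]]:
    """Return at most `target_size` hits."""
    if target_size < 0:
        raise ValueError("target_size must be >= 0")
    return hits[:target_size]


# Hypothetical flow: request 2 hits, oversample by 2x, collapse, then truncate.
request, original_size = oversample({"query": {"match_all": {}}, "size": 2}, sample_factor=2.0)

# Stand-in for the 4 hits the enlarged request would fetch.
hits = [
    {"_id": "1", "_source": {"color": "red"}},
    {"_id": "2", "_source": {"color": "red"}},
    {"_id": "3", "_source": {"color": "blue"}},
    {"_id": "4", "_source": {"color": "green"}},
]

result = truncate_hits(collapse(hits, "color"), original_size)
result_ids = [hit["_id"] for hit in result]
```

Oversampling first means that collapsing duplicates does not leave the response short: the caller still gets up to the number of hits it originally asked for.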

What other resources are available? Provide links to related issues, POCs, steps for testing, etc.
opensearch-project/OpenSearch#6722

cc:@msfroh

macohen added this to the v2.11 milestone Oct 4, 2023
macohen added the v2.11.0 label Oct 4, 2023
macohen removed this from the v2.11 milestone Oct 4, 2023
kolchfa-aws added the "1 - Backlog" label and removed the "untriaged" label Oct 4, 2023
kolchfa-aws self-assigned this Oct 4, 2023
macohen added the v2.12.0 label and removed the v2.11.0 label Oct 4, 2023
hdhalter added this to the v2.12 milestone Oct 4, 2023
kolchfa-aws added the "3 - Done" label and removed the "1 - Backlog" label Dec 18, 2023