[Search pipelines] Add Global Ignore_failure options for Processors #8373

Merged

@mingshl mingshl commented Jun 30, 2023

Description

Search processors can be chained so that they run one after another. Currently, when one of the processors throws an exception, the pipeline returns the exception immediately, loses the results of the processors that already ran, and prevents the rest of the pipeline from running.

This change adds an ignore_failure option to processors. When a processor has ignore_failure set to true and throws an exception, the exception is caught, logged, and added to the failure count, but it does not stop the remaining processors from running.
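The behavior described above can be sketched in a few lines. This is a simplified Python model under stated assumptions, not the OpenSearch implementation (which is written in Java): the `Processor` class, the `run_pipeline` function, and the `failed_counts` dictionary are hypothetical names chosen for illustration, and counting a failure even when it is rethrown is an assumption of the sketch.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict, List

logger = logging.getLogger("search_pipeline_sketch")


@dataclass
class Processor:
    """Hypothetical stand-in for a search pipeline processor."""
    name: str
    apply: Callable[[dict], dict]
    ignore_failure: bool = False  # mirrors the default described in this PR


def run_pipeline(processors: List[Processor], payload: dict,
                 failed_counts: Dict[str, int]) -> dict:
    """Run processors in order, honoring each processor's ignore_failure flag."""
    for proc in processors:
        try:
            payload = proc.apply(payload)
        except Exception as exc:
            # Assumption of this sketch: every failure is counted, ignored or not.
            failed_counts[proc.name] = failed_counts.get(proc.name, 0) + 1
            if not proc.ignore_failure:
                raise  # default behavior: the whole pipeline fails
            logger.warning("processor %s failed, ignoring: %r", proc.name, exc)
    return payload


def rename_product(doc: dict) -> dict:
    # Raises KeyError when the old field is missing, like a strict rename.
    renamed = dict(doc)
    renamed["product_name"] = renamed.pop("product_1")
    return renamed


def add_flag(doc: dict) -> dict:
    # A second processor that always succeeds.
    return {**doc, "seen": True}


counts: Dict[str, int] = {}
result = run_pipeline(
    [Processor("rename_field", rename_product, ignore_failure=True),
     Processor("add_flag", add_flag)],
    {"product_title": "Wine"},
    counts,
)
```

In this sketch the failing rename leaves the payload untouched, the second processor still runs, and `counts` records one failure for `rename_field`. With `ignore_failure` left at its default, the same call would raise instead.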

For example, given this document:

curl -XPUT localhost:9200/test2/_doc/1 --data '
{ "customer_id" : "coco", "product_title": "Wine","date" :"2023-04-01", "category" : "alcohol","internal_customer":false}' -H "Content-Type:Application/json"

We create a pipeline guest_pipeline_ignore_failure_false with a filter query processor and a rename processor, without setting ignore_failure, which defaults to false. This pipeline standardizes a field migration (a field previously named "product_1" is now named "product_name") and filters out internal customer orders.

curl -XPUT localhost:9200/_search/pipeline/guest_pipeline_ignore_failure_false --data '{
  "description": "A pipeline for guests to browse order history, which filters out internal customers orders, and standardized field migration fields",
  "request_processors": [
    {
      "filter_query": {
        "filterQuery": {
          "query": {
            "match": {
              "internal_customer": false
            }
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "product_1",
        "target_field": "product_name",
        "ignore_missing": false
      }
    }
  ]
} ' -H "Content-Type:Application/json"

Querying with guest_pipeline_ignore_failure_false ends in an exception, because this is a new document and it has no old field named product_1.

curl -XGET "localhost:9200/test2/_search?search_pipeline=guest_pipeline_ignore_failure_false" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all":{}
  }
}'

{"error":{"root_cause":[{"type":"null_pointer_exception","reason":null}],"type":"null_pointer_exception","reason":null},"status":500}

But if we create another search pipeline with ignore_failure set to true on the rename processor, then even though the rename processor throws an exception, the filter query processor still works and the document is returned.

curl -XPUT localhost:9200/_search/pipeline/guest_pipeline_ignore_failure_true --data '{
  "description": "A pipeline for guests to browse order history, which filters out internal customers orders, and standardized field migration fields",
  "request_processors": [
    {
      "filter_query": {
        "ignore_failure":false,
        "filterQuery": {
          "query": {
            "match": {
              "internal_customer": false
            }
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "product_1",
        "ignore_failure": true,
        "target_field": "product_name",
        "ignore_missing": false
      }
    }
  ]
} ' -H "Content-Type:Application/json"

curl -XGET "localhost:9200/test2/_search?search_pipeline=guest_pipeline_ignore_failure_true" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all":{}
  }
}'

{"took":29,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"test2","_id":"1","_score":1.0,"_source":
{ "customer_id" : "coco", "product_title": "Wine","date" :"2023-04-01", "category" : "","internal_customer":false}}]}}

Related Issues

#8308

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@mingshl mingshl changed the title Add ignore failure search pipeline Add Global Ingore_failure options for Processors Jun 30, 2023
@mingshl mingshl changed the title Add Global Ingore_failure options for Processors [Search pipelines] Add Global Ingore_failure options for Processors Jun 30, 2023
@mingshl mingshl changed the title [Search pipelines] Add Global Ingore_failure options for Processors [Search pipelines] Add Global Igore_failure options for Processors Jun 30, 2023
@mingshl mingshl changed the title [Search pipelines] Add Global Igore_failure options for Processors [Search pipelines] Add Global Ignore_failure options for Processors Jun 30, 2023

@mingshl
Contributor Author

mingshl commented Jul 11, 2023

I tried to reproduce the gradle check error by running ./gradlew ':server:test' --tests "org.opensearch.index.shard.RemoteStoreRefreshListenerTests" -Dtests.seed=AA4727F034E1E36D -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en,
and got a success message:

BUILD SUCCESSFUL in 34s
47 actionable tasks: 1 executed, 46 up-to-date

The gradle check failure is not related to the code change.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: null ❌
  • URL:
  • CommitID: 8ef7768
    Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
    Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.remotestore.RemoteStoreIT.testStaleCommitDeletionWithoutInvokeFlush
      1 org.opensearch.search.pit.DeletePitMultiNodeIT.testDeleteWhileSearch
      1 org.opensearch.remotestore.RemoteStoreIT.testStaleCommitDeletionWithInvokeFlush

@mingshl
Contributor Author

mingshl commented Jul 11, 2023

Thanks @andrross! Ready for merging.

@reta
Collaborator

reta commented Jul 11, 2023

@mingshl @msfroh the general question I have now: how does the consumer know which pipeline processors failed? Or at least, that some pipeline processors failed? It seems to me we need to at least somehow propagate that back into the response, or the consumer would happily believe things are fine ...

@mingshl
Contributor Author

mingshl commented Jul 11, 2023

The ignoreFailure option defaults to false, in which case the error is thrown as usual.
If ignoreFailure is set to true, then when the exception is caught, the pipeline logs it at warn level and increments the failedProcessor count. The failed-processor stats, with the name of the processor and the name of the pipeline, can be retrieved from the search pipeline metrics; please see the testStatsEnabledIgnoreFailure test.
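
As a rough illustration of that bookkeeping, here is a minimal Python sketch. It assumes only what the comment states (a count kept per pipeline and per processor); `failed_processor_stats` and `record_ignored_failure` are hypothetical names, not part of the OpenSearch code base.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the search pipeline metrics,
# keyed by (pipeline name, processor name).
failed_processor_stats: Counter = Counter()


def record_ignored_failure(pipeline: str, processor: str) -> None:
    """Increment the failure count for one processor of one pipeline."""
    failed_processor_stats[(pipeline, processor)] += 1


# Three separate search requests hit the same failing processor.
for _ in range(3):
    record_ignored_failure("guest_pipeline_ignore_failure_true", "rename_field")

total = failed_processor_stats[("guest_pipeline_ignore_failure_true", "rename_field")]
```

Because the counter is keyed only by pipeline and processor, it accumulates over every request that uses the pipeline and carries no per-request information.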

Hope this makes sense.

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
@reta
Collaborator

reta commented Jul 11, 2023

This is helpful in general but not for individual users:

  • the users don't look in logs, may not even have access to that
  • the stats are aggregated across all requests, not only the one concrete search query the consumer is running

Regarding "also increments the failedProcessor count": do we return that count with the search response? That would be helpful for sure.

@mingshl
Contributor Author

mingshl commented Jul 11, 2023

That's a good point. A response that has gone through a pipeline (search pipeline or ingest pipeline) looks the same as any other search response; that is one of the charms of using search pipelines. So we cannot tell from a response whether a failure was ignored, which is similar to the existing design of ignore_failure in the ingest pipeline.

The search pipeline stats do gather stats across all requests that use the pipeline API.

If we would like to expose information from the search pipeline in the search response (for example the pipeline name, the number of processors in the pipeline, each processor's name, whether it ignores failures, and the failed processor count), we can discuss that in a new issue and might implement it in a future release.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessOnThirdAttemptAttempt

@reta
Collaborator

reta commented Jul 11, 2023

I think that would be useful, could you please create an issue for that? Certainly not enough time to make it to 2.9.0 but we could merge this change and improve on it later. Thank you

@mingshl
Contributor Author

mingshl commented Jul 11, 2023

[Search Pipeline] Expose Search Pipeline Stats in Search Response #8635

Issue #8635 has been created, and it references this PR as well. Are we good to merge this PR? Thanks for rerunning the tests; they all pass now. Thank you @reta

@reta reta merged commit 2004ba0 into opensearch-project:main Jul 11, 2023
9 checks passed
@reta reta added the backport 2.x Backport to 2.x branch label Jul 11, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 11, 2023
…8373)

* Add Global Ingore_failure options for Processors

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* add changelog

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* Add ignore_failure to 40_rename_response

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* Change Boolean to boolean and refactor AbstractProcessor

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* rename to isIgnoreFailure and add tests

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* rename to isIgnoreFailure and add tests

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* add ignoreFailure to runSearchPhaseResultsTransformer

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* fix filter query and change log warn message

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* Add test on matching each processor stat

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* Add test on matching each processor stat

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* remove extra spaces and words

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* use IGNORE_FAILURE_KEY

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

---------

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
(cherry picked from commit 2004ba0)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
reta pushed a commit that referenced this pull request Jul 11, 2023
…8373) (#8637)

(cherry picked from commit 2004ba0)

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
vikasvb90 pushed a commit to raghuvanshraj/OpenSearch that referenced this pull request Jul 12, 2023
…pensearch-project#8373)

raghuvanshraj pushed a commit to raghuvanshraj/OpenSearch that referenced this pull request Jul 12, 2023
…pensearch-project#8373)

dzane17 pushed a commit to dzane17/OpenSearch that referenced this pull request Jul 12, 2023
…pensearch-project#8373)

buddharajusahil pushed a commit to buddharajusahil/OpenSearch that referenced this pull request Jul 18, 2023
…pensearch-project#8373)

Signed-off-by: sahil buddharaju <sahilbud@amazon.com>
baba-devv pushed a commit to baba-devv/OpenSearch that referenced this pull request Jul 29, 2023
…pensearch-project#8373)

shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…pensearch-project#8373)

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch v2.9.0 'Issues and PRs related to version v2.9.0'