Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method that parses dates without throwing exceptions #83801

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

danhermann
Copy link
Contributor

To be used to improve the performance of the date processor when it encounters a lot of date patterns that do not match the supplied input.

Relates to #73918

@danhermann danhermann added :Core/Infra/Core Core issues without another label :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >refactoring v8.2.0 labels Feb 10, 2022
@elasticmachine elasticmachine added Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team labels Feb 10, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@danhermann
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/bwc

@danhermann
Copy link
Contributor Author

@elasticmachine update branch

* @return Tuple containing a boolean value indicating parsing success or failure and a java time object containing
* the parsed input in the case of successful parsing
*/
Tuple<Boolean, TemporalAccessor> parseWithoutException(String input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expected that a success could result in a null TemporalAccessor? That is, is it acceptable to return Tuple.tuple(true, null)?

If not, what about Optional<TemporalAccessor>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done in 15622ec

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain how this differs from the existing parse method? The existing method doesn't catch exceptions, it looks almost identical to your new implementation. If the goal is to simply do away with any exceptions, then I think we should change the existing parse method to work like you have, but return null on no match. Having two methods will make it unclear which should be used.

* @return Tuple containing a boolean value indicating parsing success or failure and a java time object containing
* the parsed input in the case of successful parsing
*/
Optional<TemporalAccessor> parseWithoutException(String input);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we should add another parse method. If this is more performant, then let's change the implementation of parse.

@danhermann
Copy link
Contributor Author

Could you explain how this differs from the existing parse method? The existing method doesn't catch exceptions, it looks almost identical to your new implementation. If the goal is to simply do away with any exceptions, then I think we should change the existing parse method to work like you have, but return null on no match. Having two methods will make it unclear which should be used.

@rjernst, the existing method will throw an exception if none of the parsers complete successfully. I believe this was the intended behavior of the method given the explicit throw in the doParse method and its corresponding @throws entry in the method javadoc. This causes performance problems in a number of solutions use cases that make heavy use of the date processor in ingest pipelines since date parsing failures are not exceptional there.

This method is also used in many places in ES (mappings, cluster metadata, etc.) where its behavior is at least expected if not preferred since exceptions are explicitly caught and handled but null return values are not. I wouldn't presume to change all those other usages as a date parsing failure in those cases may very well be exceptional.

@rjernst
Copy link
Member

rjernst commented Feb 14, 2022

I wouldn't presume to change all those other usages as a date parsing failure in those cases may very well be exceptional.
use cases that make heavy use of the date processor in ingest pipelines since date parsing failures are not exceptional there.

I don't think date parsing failure in other cases within Elasticsearch is any more or less exceptional. I would argue that even in ingest processors not having any patterns match is exceptional. Can you explain in what situation not matching any patterns is expected and normal?

I don't have a preference for either way, but I think we should stick with one type of error handling. Your new method is almost a copy of the existing parse method. These kinds of things can become tricky to keep in sync as changes are made over time. And like I mentioned in my previous comment, having 2 different ways to call a parser is confusing for callers, not knowing which is preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >refactoring Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.