-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support functions in Data Prepper expressions #2626 #2644
Conversation
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
public Object applyFunction(final String functionName, final String[] args, Event event) { | ||
// Supports length(String) and length(JsonPointer) | ||
// For example., length("abcd"), length("/keyName") | ||
if (functionName.equals("length")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should take a factory-based approach similar to what we did for operators. See the OperatorProvider
class and how it is used.
At a high-level, The OperatorProvider
takes a List<Operator<?>>
as input. Spring will inject this list. It builds it up from all the actual Operator
implementations (see AndOperator
as one example). Because they are @Named
, they are injected into the application context and thus become part of the List<Operator<?>>
.
I think the interface might look something like the following:
interface ExpressionFunction<T> {
int getNumberOfArguments();
String getFunctionName();
T evaluate(final Object ... args);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlvenable Not sure if this suggestion works for functions unless we plan to put function names in the grammar. Is that what you are suggesting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed offline. No, this will not be a change to the grammar. This will use the Spring application context to get the List<ExpressionFunction>
.
A class like this should work:
@Named
public class ExpressionFunctionProvider {
private final Map<String, ExpressionFunction> functionsMap;
@Inject
public ExpressionFunctionProvider(List<ExpressionFunction> expressionFunctions) {
functionsMap = expressionFunctions.stream().collect(Collectors.toMap(e -> e.getFunctionName(), e -> e));
}
public Object provideFunction(String functionName, String[] args, Event event) {
return functionsMap.get(functionName);
}
}
This should also be resolving: #2639 |
…rovider and implementation Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Codecov Report
@@ Coverage Diff @@
## main #2644 +/- ##
============================================
+ Coverage 93.45% 93.57% +0.12%
- Complexity 2185 2238 +53
============================================
Files 258 261 +3
Lines 6107 6275 +168
Branches 497 519 +22
============================================
+ Hits 5707 5872 +165
+ Misses 268 266 -2
- Partials 132 137 +5
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes should have a small integration test within the data-prepper-expression package to verify behavior end-to-end.
You may be able to get some ideas on this from ConditionalExpressionEvaluatorIT
.
if (args.size() != 1) { | ||
throw new RuntimeException("length() takes only one argument"); | ||
} | ||
Object arg = args.get(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently evaluating a string directly, right?
In the description for #2639, we are looking to evaluate a JSON Path within an Event.
With the current implementation:
length("/log_level")
Output: 10
I believe we'd want this to evaluate as such:
Event:
{"log_level": "INFO"}
Output 4
Event:
{"log_level": "ERROR"}
Output 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The JsonPointers are already resolved before passing them to this function. So, it would work as you mentioned.
* @return the result of function evaluation | ||
* @since 2.3 | ||
*/ | ||
Object evaluate(final List<Object> args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this include Event
? See my comments below on how to evaluate length
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. JsonPointers are already resolved when this is called.
return new LengthExpressionFunction(); | ||
} | ||
|
||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd make this an @ParameterizedTest
with a few values: 0, 1, 2, 5, 20.
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
List<Object> argList = new ArrayList<>(); | ||
for (final String arg: args) { | ||
String trimmedArg = arg.trim(); | ||
if (trimmedArg.charAt(0) == '/') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I see now that this is where you resolve the JSON Pointer.
I think we should let the function decide which arguments are JSON Pointers and which are not. For example, the length
function should state that the first argument is a JSON Pointer.
Also, we generally support a simpler JSON Pointer syntax without the leading slash if you are getting from the root object. So, this logic will not work for this situation.
Also, see that for CIDR checks, we want to include arguments which are not part of the event. #2625 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate more on what is "simpler JSON Pointer"? It looks like grammar only supports JSON pointer with forward slash.
Regarding CIDR checks, Could you provide some example usecases? Are you referring to issues like #2625? To any function that may be dealing with CIDR blocks, it would get a string argument like 192.168.10.*
and it is up to the function to convert it to network addresses. The grammar that I defined in this PR, only allows string or JsonPointer arguments. JsonPointer argument will be looked in the event and the resultant value can be of any "type". That's why the argument list is a list of Objects and not list of Strings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thoughts, I think we need to pass event
to evaluate
because we may have to use the string arguments as an argument to event tags or event metadata but resolving JsonPointer in ParseTreeCoercionService still makes sense because it avoids code duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate more on what is "simpler JSON Pointer"? It looks like grammar only supports JSON pointer with forward slash.
Yes, that is what I meant. I'm sorry about the confusion. I know we support this in the Event model (see this test case), but perhaps not in the grammar.
I think that the function itself needs to know what its arguments are and how to interpret them.
Consider the possibility of a hasSubstring
method defined as hasSubstring(eventPointer, substring)
.
We can't be sure how pipeline authors would use it. Perhaps they have the following:
hasSubstring("/file", "/usr/var/linux/path");
This is a reasonable scenario. However, the current code would interpret that the second argument should be a JSON Pointer.
On second thoughts, I think we need to pass event to evaluate because we may have to use the string arguments as an argument to event tags or event metadata
This is a good point. Indeed, getTags()
will have not arguments (at least as initially planned).
but resolving JsonPointer in ParseTreeCoercionService still makes sense because it avoids code duplication.
I think we should look to see how we can generalize resolveJsonPointerValue
. One simple approach could be to also pass a BiFunction<String, Event, Object>
into the evaluate
method.
return expressionFunctionProvider.provideFunction(functionName, event, argList, this::resolveJsonPointerValue)
Another option is an Event
decorator so that we could call Event::get
, but that might be a larger undertaking.
@@ -160,11 +165,16 @@ private static Stream<Arguments> validExpressionArguments() { | |||
complexEvent(ALL_JACKSON_EVENT_GET_SUPPORTED_CHARACTERS, true), | |||
true), | |||
Arguments.of("/durationInNanos > 5000000000", event("{\"durationInNanos\": 6000000000}"), true), | |||
Arguments.of("/response == \"OK\"", event("{\"response\": \"OK\"}"), true) | |||
Arguments.of("/response == \"OK\"", event("{\"response\": \"OK\"}"), true), | |||
Arguments.of("length(/response) == "+testStringLength, event("{\"response\": \""+testString+"\"}"), true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I'm glad this works.
Also, there is a test failure:
|
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
|
||
public Object provideFunction(final String functionName, final List<Object> argList) { | ||
if (!expressionFunctionsMap.containsKey(functionName)) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning null is an anti pattern.
I would recommend we throw and exception here that the function does not exist rather than rely on null checks or have to debug a NPE down the road.
import javax.inject.Named; | ||
|
||
@Named | ||
public class LengthExpressionFunction implements ExpressionFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's be sure to add documentation for this new feature here: https://github.com/opensearch-project/data-prepper/blob/main/docs/expression_syntax.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also create a GitHub issue in this repo for us to update the syntax in the user-facing documents.
https://github.com/opensearch-project/documentation-website
@@ -183,7 +193,8 @@ private static Stream<Arguments> invalidExpressionArguments() { | |||
Arguments.of("not null", event("{}")), | |||
Arguments.of("not/status_code", event("{\"status_code\": 200}")), | |||
Arguments.of("trueand/status_code", event("{\"status_code\": 200}")), | |||
Arguments.of("trueor/status_code", event("{\"status_code\": 200}")) | |||
Arguments.of("trueor/status_code", event("{\"status_code\": 200}")), | |||
Arguments.of("length(\""+testString+") == "+testStringLength, event("{\"response\": \""+testString+"\"}"), true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test which uses this list of args takes two arguments. You have three on this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it. Thanks!
if (argStr.charAt(0) == '\"') { | ||
if (argStr.charAt(argStr.length()-1) != '\"') { | ||
throw new RuntimeException("Invalid string passed to length() "+argStr); | ||
} | ||
return Integer.valueOf(argStr.length()-2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does excluding the extra quotes make sense to the user excluding only the first set of extra quotes? How would this work for string data like: "\"\"\"\"foobar\"\"\"\""
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should work like Java Strings. If the input is "abcd", the length would be 4. If the input is ""abcd"", the length would be 6.
…e common infra Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appear to be some situations not covered by the tests and the build is failing.
[ant:jacocoReport] Rule violated for bundle data-prepper-expression: instructions covered ratio is 0.9, but expected minimum is 1.0
Please address those and the comment from @cmanning09 . Aside from that, this looks good to me.
if (argStr.length() == 0) { | ||
return 0; | ||
} | ||
if (argStr.charAt(0) == '\"') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we should support?
I'm not sure it adds a lot of value. And it could lead to confusing situations for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to think we should start simply by assuming that the first (and only) argument must be a JSON Pointer. If it is not, the expression should fail.
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
…er expressions Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this!
docs/expression_syntax.md
Outdated
Currently the following functions are supported | ||
* `length` | ||
- takes one argument of string or JsonPointer type | ||
- returns the length of the string if the input is of the string type. For example, `length("abcd")` returns 4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not supporting literals at the moment. The argument must be a jsonPointer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. I have updated the doc.
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Description
Adds supports for functions in expressions.
Only length() function is supported initially.
Resolves #2626 #2639
Issues Resolved
#2626
#2639
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.