-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support functions in Data Prepper expressions #2626 #2644
Changes from 3 commits
292e006
1093afa
12a6896
03db5c7
d6c680b
49b09b0
498df1f
8199b93
3678327
1b3d76d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.dataprepper.expression; | ||
|
||
import java.util.List; | ||
|
||
interface ExpressionFunction { | ||
/** | ||
* @return function name | ||
* @since 2.3 | ||
*/ | ||
String getFunctionName(); | ||
|
||
/** | ||
* evaluates the function and returns the result | ||
* @param args list of arguments to the function | ||
* @return the result of function evaluation | ||
* @since 2.3 | ||
*/ | ||
Object evaluate(final List<Object> args); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.dataprepper.expression; | ||
|
||
import javax.inject.Named; | ||
import javax.inject.Inject; | ||
import java.util.Map; | ||
import java.util.List; | ||
import java.util.stream.Collectors; | ||
|
||
@Named | ||
public class ExpressionFunctionProvider { | ||
private Map<String, ExpressionFunction> expressionFunctionsMap; | ||
|
||
@Inject | ||
public ExpressionFunctionProvider(final List<ExpressionFunction> expressionFunctions) { | ||
expressionFunctionsMap = expressionFunctions.stream().collect(Collectors.toMap(e -> e.getFunctionName(), e -> e)); | ||
} | ||
|
||
public Object provideFunction(final String functionName, final List<Object> argList) { | ||
if (!expressionFunctionsMap.containsKey(functionName)) { | ||
return null; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Returning null is an anti pattern. I would recommend we throw and exception here that the function does not exist rather than rely on null checks or have to debug a NPE down the road. |
||
} | ||
return expressionFunctionsMap.get(functionName).evaluate(argList); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.dataprepper.expression; | ||
|
||
import java.util.List; | ||
import javax.inject.Named; | ||
|
||
@Named | ||
public class LengthExpressionFunction implements ExpressionFunction { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's be sure to add documentation for this new feature here: https://github.com/opensearch-project/data-prepper/blob/main/docs/expression_syntax.md There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please also create a GitHub issue in this repo for us to update the syntax in the user-facing documents. https://github.com/opensearch-project/documentation-website |
||
public String getFunctionName() { | ||
return "length"; | ||
} | ||
|
||
public Object evaluate(final List<Object> args) { | ||
if (args.size() != 1) { | ||
throw new RuntimeException("length() takes only one argument"); | ||
} | ||
Object arg = args.get(0); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is currently evaluating a string directly, right? In the description for #2639, we are looking to evaluate a JSON Path within an Event. With the current implementation:
Output: I believe we'd want this to evaluate as such: Event:
Output Event:
Output There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No. The JsonPointers are already resolved before passing them to this function. So, it would work as you mentioned. |
||
if (!(arg instanceof String)) { | ||
throw new RuntimeException("length() takes only String type arguments"); | ||
} | ||
String argStr = ((String)arg).trim(); | ||
if (argStr.charAt(0) == '\"') { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this something we should support? I'm not sure it adds a lot of value. And it could lead to confusing situations for users. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tend to think we should start simply by assuming that the first (and only) argument must be a JSON Pointer. If it is not, the expression should fail. |
||
if (argStr.charAt(argStr.length()-1) != '\"') { | ||
throw new RuntimeException("Invalid string passed to length() "+argStr); | ||
} | ||
return Integer.valueOf(argStr.length()-2); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does excluding the extra quotes make sense to the user excluding only the first set of extra quotes? How would this work for string data like: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should work like Java Strings. If the input is "abcd", the length would be 4. If the input is ""abcd"", the length would be 6. |
||
} else { | ||
return Integer.valueOf(argStr.length()); | ||
} | ||
} | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,21 +13,43 @@ | |
import javax.inject.Named; | ||
import java.io.Serializable; | ||
import java.util.Map; | ||
import java.util.List; | ||
import java.util.ArrayList; | ||
import java.util.function.Function; | ||
|
||
@Named | ||
class ParseTreeCoercionService { | ||
private final Map<Class<? extends Serializable>, Function<Object, Object>> literalTypeConversions; | ||
private ExpressionFunctionProvider expressionFunctionProvider; | ||
|
||
@Inject | ||
public ParseTreeCoercionService(final Map<Class<? extends Serializable>, Function<Object, Object>> literalTypeConversions) { | ||
public ParseTreeCoercionService(final Map<Class<? extends Serializable>, Function<Object, Object>> literalTypeConversions, ExpressionFunctionProvider expressionFunctionProvider) { | ||
this.literalTypeConversions = literalTypeConversions; | ||
this.expressionFunctionProvider = expressionFunctionProvider; | ||
} | ||
|
||
public Object coercePrimaryTerminalNode(final TerminalNode node, final Event event) { | ||
final int nodeType = node.getSymbol().getType(); | ||
final String nodeStringValue = node.getText(); | ||
switch (nodeType) { | ||
case DataPrepperExpressionParser.Function: | ||
final int funcNameIndex = nodeStringValue.indexOf("("); | ||
final String functionName = nodeStringValue.substring(0, funcNameIndex); | ||
final int argsEndIndex = nodeStringValue.indexOf(")", funcNameIndex); | ||
final String argsStr = nodeStringValue.substring(funcNameIndex+1, argsEndIndex); | ||
final String[] args = argsStr.split(","); | ||
List<Object> argList = new ArrayList<>(); | ||
for (final String arg: args) { | ||
String trimmedArg = arg.trim(); | ||
if (trimmedArg.charAt(0) == '/') { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. I see now that this is where you resolve the JSON Pointer. I think we should let the function decide which arguments are JSON Pointers and which are not. For example, the Also, we generally support a simpler JSON Pointer syntax without the leading slash if you are getting from the root object. So, this logic will not work for this situation. Also, see that for CIDR checks, we want to include arguments which are not part of the event. #2625 (comment) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you elaborate more on what is "simpler JSON Pointer"? It looks like grammar only supports JSON pointer with forward slash. Regarding CIDR checks, Could you provide some example usecases? Are you referring to issues like #2625? To any function that may be dealing with CIDR blocks, it would get a string argument like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On second thoughts, I think we need to pass There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, that is what I meant. I'm sorry about the confusion. I know we support this in the Event model (see this test case), but perhaps not in the grammar. I think that the function itself needs to know what its arguments are and how to interpret them. Consider the possibility of a We can't be sure how pipeline authors would use it. Perhaps they have the following:
This is a reasonable scenario. However, the current code would interpret that the second argument should be a JSON Pointer.
This is a good point. Indeed,
I think we should look to see how we can generalize
Another option is an |
||
argList.add(resolveJsonPointerValue(arg, event)); | ||
} else if (trimmedArg.charAt(0) == '"') { | ||
argList.add(trimmedArg); | ||
} else { | ||
throw new RuntimeException("Unsupported type passed as function argument"); | ||
} | ||
} | ||
return expressionFunctionProvider.provideFunction(functionName, argList); | ||
case DataPrepperExpressionParser.EscapedJsonPointer: | ||
final String jsonPointerWithoutQuotes = nodeStringValue.substring(1, nodeStringValue.length() - 1); | ||
return resolveJsonPointerValue(jsonPointerWithoutQuotes, event); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.dataprepper.expression; | ||
|
||
import org.junit.jupiter.api.Test; | ||
import org.junit.jupiter.api.extension.ExtendWith; | ||
import org.mockito.junit.jupiter.MockitoExtension; | ||
import org.apache.commons.lang3.RandomStringUtils; | ||
import static org.hamcrest.CoreMatchers.equalTo; | ||
import static org.hamcrest.MatcherAssert.assertThat; | ||
import static org.mockito.ArgumentMatchers.any; | ||
import static org.mockito.Mockito.mock; | ||
import static org.mockito.Mockito.lenient; | ||
|
||
import java.util.List; | ||
|
||
@ExtendWith(MockitoExtension.class) | ||
class ExpressionFunctionProviderTest { | ||
private ExpressionFunctionProvider objectUnderTest; | ||
private String testFunctionName; | ||
private Object testResultObject; | ||
private ExpressionFunction expressionFunction; | ||
|
||
public ExpressionFunctionProvider createObjectUnderTest() { | ||
expressionFunction = mock(ExpressionFunction.class); | ||
testFunctionName = RandomStringUtils.randomAlphabetic(8); | ||
testResultObject = mock(Object.class); | ||
lenient().when(expressionFunction.evaluate(any(List.class))).thenReturn(testResultObject); | ||
lenient().when(expressionFunction.getFunctionName()).thenReturn(testFunctionName); | ||
|
||
return new ExpressionFunctionProvider(List.of(expressionFunction)); | ||
} | ||
|
||
@Test | ||
void testUnknownFunction() { | ||
objectUnderTest = createObjectUnderTest(); | ||
String unknownFunctionName = RandomStringUtils.randomAlphabetic(8); | ||
assertThat(objectUnderTest.provideFunction(unknownFunctionName, List.of()), equalTo(null)); | ||
} | ||
|
||
@Test | ||
void testFunctionBasic() { | ||
objectUnderTest = createObjectUnderTest(); | ||
assertThat(objectUnderTest.provideFunction(testFunctionName, List.of()), equalTo(testResultObject)); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.dataprepper.expression; | ||
|
||
import org.junit.jupiter.api.Test; | ||
import static org.hamcrest.CoreMatchers.equalTo; | ||
import static org.hamcrest.MatcherAssert.assertThat; | ||
import static org.junit.jupiter.api.Assertions.assertThrows; | ||
import org.apache.commons.lang3.RandomStringUtils; | ||
|
||
import java.util.List; | ||
|
||
class LengthExpressionFunctionTest { | ||
private LengthExpressionFunction lengthExpressionFunction; | ||
|
||
public LengthExpressionFunction createObjectUnderTest() { | ||
return new LengthExpressionFunction(); | ||
} | ||
|
||
@Test | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd make this an |
||
void testWithOneStringArgumentWithOutQuotes() { | ||
lengthExpressionFunction = createObjectUnderTest(); | ||
String testString = RandomStringUtils.randomAlphabetic(5); | ||
assertThat(lengthExpressionFunction.evaluate(List.of(testString)), equalTo(testString.length())); | ||
} | ||
|
||
@Test | ||
void testWithOneStringArgumentWithQuotes() { | ||
lengthExpressionFunction = createObjectUnderTest(); | ||
String testString = RandomStringUtils.randomAlphabetic(5); | ||
assertThat(lengthExpressionFunction.evaluate(List.of("\""+testString + "\"")), equalTo(testString.length())); | ||
} | ||
|
||
@Test | ||
void testWithTwoArgs() { | ||
lengthExpressionFunction = createObjectUnderTest(); | ||
assertThrows(RuntimeException.class, () -> lengthExpressionFunction.evaluate(List.of("arg1", "arg2"))); | ||
} | ||
|
||
@Test | ||
void testWithNonStringArgument() { | ||
lengthExpressionFunction = createObjectUnderTest(); | ||
assertThrows(RuntimeException.class, () -> lengthExpressionFunction.evaluate(List.of(10))); | ||
} | ||
|
||
@Test | ||
void testWithInvalidStringArgument() { | ||
lengthExpressionFunction = createObjectUnderTest(); | ||
assertThrows(RuntimeException.class, () -> lengthExpressionFunction.evaluate(List.of("\"arg1"))); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this include
Event
? See my comments below on how to evaluatelength
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. JsonPointers are already resolved when this is called.