Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic operator support to arithmetic and string expressions #2726

Merged
merged 7 commits into from
May 25, 2023

Conversation

kkondaka
Copy link
Collaborator

@kkondaka kkondaka commented May 21, 2023

Description

Adds support for basic operators to arithmetic and string expressions.
Arithmetic operators supported are +, -, *, /

Some example arithmetic expressions supported with this change are
/key1 + /key2 +5
length(/message) - getMetadata("attr") + 10
10.2 + 20 * 6.5
(10 + 20) * 6

String operator supported + for concatenation.
Example expressions with string concatenation operator
/key1 + /key2
/message + "suffix"
getMetadata("attr") + /message

Resolves #2685 #2686

Issues Resolved

Resolves #2685 #2686

Check List

  • [X ] New functionality includes testing.
  • [X ] New functionality has been documented.
    • New functionality has javadoc added
  • [X ] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
@@ -17,16 +17,28 @@ expression
;

stringExpression
: function
: stringExpression DOT stringExpression
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we not able to reuse + for both strings and arithmetic?

…ectly

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
| arithmeticTerm
;

arithmeticTerm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you call this a term?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the following grammar from googling

<Exp> ::= <Exp> + <Term> |
<Exp> - <Term> |
<Term>

<Term> ::= <Term> * <Factor> |
<Term> / <Factor> |
<Factor>

<Factor> ::= x | y | ... |
( <Exp> ) |
- <Factor>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should probably find another name. A term is more generic than two numbers multiplied together.

https://simple.wikipedia.org/wiki/Term_(mathematics)

There might be better names, but perhaps you can split this into a "factor term" and a "summed term?"

@@ -8,7 +8,9 @@
import org.antlr.v4.runtime.RuleContext;

interface Operator<T> {
int getNumberOfOperands();
default int getNumberOfOperands(final RuleContext ctx) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this.

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
@codecov
Copy link

codecov bot commented May 23, 2023

Codecov Report

Merging #2726 (c04dbf7) into main (f2c9341) will increase coverage by 0.50%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #2726      +/-   ##
============================================
+ Coverage     93.60%   94.11%   +0.50%     
- Complexity     2270     2439     +169     
============================================
  Files           264      278      +14     
  Lines          6337     6744     +407     
  Branches        528      550      +22     
============================================
+ Hits           5932     6347     +415     
+ Misses          266      257       -9     
- Partials        139      140       +1     
Impacted Files Coverage Δ
...opensearch/dataprepper/expression/AndOperator.java 100.00% <ø> (ø)
...h/dataprepper/expression/GenericInSetOperator.java 100.00% <ø> (ø)
...aprepper/expression/GenericRegexMatchOperator.java 100.00% <ø> (ø)
...opensearch/dataprepper/expression/NotOperator.java 100.00% <ø> (ø)
...dataprepper/expression/NumericCompareOperator.java 100.00% <ø> (ø)
.../opensearch/dataprepper/expression/OrOperator.java 100.00% <ø> (ø)
.../dataprepper/model/event/DefaultEventMetadata.java 92.00% <100.00%> (+0.33%) ⬆️
...rch/dataprepper/event/DefaultBaseEventBuilder.java 100.00% <100.00%> (ø)
...arch/dataprepper/expression/AddBinaryOperator.java 100.00% <100.00%> (ø)
...taprepper/expression/ArithmeticBinaryOperator.java 100.00% <100.00%> (ø)
... and 6 more

... and 29 files with indirect coverage changes

@@ -18,11 +18,6 @@ protected BinaryOperator(final int symbol, final int shouldEvaluateRuleIndex) {
displayName = DataPrepperExpressionParser.VOCABULARY.getDisplayName(symbol);
}

@Override
public int getNumberOfOperands() {
return NUMBER_OF_OPERANDS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove line 9 in this file now.

We should leverage a single source of truth for a class. The evaluate method uses a hardcoded number to enforce the number of operands. Can we update the evaluate method to leverage the default method now as well?

Map.of(
Integer.class, (lhs, rhs) -> (double)lhs / (double)(int)rhs,
Float.class, (lhs, rhs) -> (double)lhs / (Float) rhs,
Long.class, (lhs, rhs) -> (double)lhs / (double)(long)rhs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the lhs cast to only a double where the rhs is cast to a long and then a double?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because LHS is always double (final Map<Class<? extends Number>, BiFunction<Object, Object, Number>> doubleOperations) and RHS has entries for 4 different types.

Comment on lines 116 to 121
if (expectedClass == null) {
assertThrows(ExpressionEvaluationException.class, () -> evaluator.evaluate(expression, event));
} else {
Object result = evaluator.evaluate(expression, event);
assertThat(result, not(instanceOf(expectedClass)));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should separate out these tests. These are two separate behaviors that should be handled separately.

Comment on lines 9 to 11
| 3 | `*`, `/` | Multiple and Divison Operators | left-to-right |
| 2 | `==`, `!=` | Equality Operators | left-to-right |
| 2 | `+`, `-` | Addition and Subtraction Operators | left-to-right |
Copy link
Contributor

@cmanning09 cmanning09 May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we place these operations on separate levels from the the relational and equality operators? I believe we need to evaluate these operators before we can compare relataional or equality operators.

At the moment according to this documentation the following statement would result in an error with the current priority order: 2 + 4 > 5

Please confirm the priority order for these new operators and the behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes do not support expressions like 2 + 4 > 5 for now. It is a relatively simple change from here but did not include it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we don't support that expression type right now the documentation is miss leading and violates the standard order of operations. We need to correct this.

| `==`, `!=` | Equality Operators | No | `/status == 200`<br>`/status_code==200` | |
| `and`, `or`, `not` | Conditional Operators | Yes | `/a<300 and /b>200` | `/b<300and/b>200` |
| `,` | Set Value Delimiter | No | `/a in {200, 202}`<br>`/a in {200,202}`<br>`/a in {200 , 202}` | `/a in {200,}` |
| `+`, `-` | Add and Subtract Operators | No | `/status_code + length(/message) - 2` | |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we indicate + support concatenation as well?

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
graytaylor0
graytaylor0 previously approved these changes May 24, 2023
| arithmeticFactor
;

arithmeticFactor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might this be the arithmeticTerm?

Then what you have as arithmeticTerm could be an arithmeticFactor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page for the C grammar calls these "multiplicative_expression" and "additive_expression".

https://www.lysator.liu.se/c/ANSI-C-grammar-y.html

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
cmanning09
cmanning09 previously approved these changes May 25, 2023
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Copy link
Member

@dlvenable dlvenable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks great!

@kkondaka kkondaka deleted the expr-operators branch July 13, 2023 04:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for Arithmetic expressions in DataPrepper expression
4 participants