Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add REGEXP_REPLACE function. #14460

Merged
merged 7 commits into from
Jun 29, 2023
Merged

Add REGEXP_REPLACE function. #14460

merged 7 commits into from
Jun 29, 2023

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Jun 21, 2023

Replaces all instances of a pattern with a replacement string.

Replaces all instances of a pattern with a replacement string.
if (s == null || pattern == null || replacement == null) {
return ExprEval.of(null);
} else {
final Matcher matcher = Pattern.compile(pattern).matcher(s);

Check failure

Code scanning / CodeQL

Regular expression injection

This regular expression is constructed from a [user-provided value](1). This regular expression is constructed from a [user-provided value](2). This regular expression is constructed from a [user-provided value](3).
Comment on lines +305 to +314
testHelper.testExpressionString(
new RegexpReplaceOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeLiteral("x(.)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(\"s\",'x(.)','z')"),
"foo"
);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [ExpressionTestHelper.testExpressionString](1) should be avoided because it has been deprecated.
testHelper.makeLiteral("x(.)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(\"s\",'x(.)','z')"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeExpression](1) should be avoided because it has been deprecated.
Comment on lines +316 to +325
testHelper.testExpressionString(
new RegexpReplaceOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeLiteral("(o)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(\"s\",'(o)','z')"),
"fzz"
);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [ExpressionTestHelper.testExpressionString](1) should be avoided because it has been deprecated.
testHelper.makeLiteral("(o)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(\"s\",'(o)','z')"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeExpression](1) should be avoided because it has been deprecated.
Comment on lines +327 to +340
testHelper.testExpressionString(
new RegexpReplaceOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeCall(
SqlStdOperatorTable.CONCAT,
testHelper.makeLiteral("Z"),
testHelper.makeInputRef("s")
),
testHelper.makeLiteral("Zf(.)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(concat('Z',\"s\"),'Zf(.)','z')"),
"zo"
);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [ExpressionTestHelper.testExpressionString](1) should be avoided because it has been deprecated.
testHelper.makeLiteral("Zf(.)"),
testHelper.makeLiteral("z")
),
makeExpression("regexp_replace(concat('Z',\"s\"),'Zf(.)','z')"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeExpression](1) should be avoided because it has been deprecated.
Comment on lines +342 to +351
testHelper.testExpressionString(
new RegexpReplaceOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeLiteral("f(.)"),
testHelper.makeLiteral("$1")
),
makeExpression("regexp_replace(\"s\",'f(.)','$1')"),
"oo"
);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [ExpressionTestHelper.testExpressionString](1) should be avoided because it has been deprecated.
testHelper.makeLiteral("f(.)"),
testHelper.makeLiteral("$1")
),
makeExpression("regexp_replace(\"s\",'f(.)','$1')"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [CalciteTestBase.makeExpression](1) should be avoided because it has been deprecated.
@vogievetsky
Copy link
Contributor

Beast.

final String patternString = NullHandling.nullToEmptyIfNeeded((String) patternExpr.getLiteralValue());

this.arg = args.get(0);
this.pattern = patternString != null ? Pattern.compile(patternString) : null;

Check failure

Code scanning / CodeQL

Regular expression injection

This regular expression is constructed from a [user-provided value](1). This regular expression is constructed from a [user-provided value](2). This regular expression is constructed from a [user-provided value](3). This regular expression is constructed from a [user-provided value](4). This regular expression is constructed from a [user-provided value](5).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a real problem, but.. I'm not really sure the best way to solve the DoS problem since we do want them to be able to provide a pattern, and it can happen in other places where regex are provided by user queries too.

Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

final String patternString = NullHandling.nullToEmptyIfNeeded((String) patternExpr.getLiteralValue());

this.arg = args.get(0);
this.pattern = patternString != null ? Pattern.compile(patternString) : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a real problem, but.. I'm not really sure the best way to solve the DoS problem since we do want them to be able to provide a pattern, and it can happen in other places where regex are provided by user queries too.

Comment on lines +37 to +40
private static final SqlFunction SQL_FUNCTION = OperatorConversions
.operatorBuilder("REGEXP_REPLACE")
.operandTypes(SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER)
.requiredOperands(3)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [OperatorBuilder.requiredOperands](1) should be avoided because it has been deprecated.
@gianm
Copy link
Contributor Author

gianm commented Jun 29, 2023

Something got wedged with the test [standard-its / (Compile=openjdk8, Run=openjdk8, Cluster Build On K8s) ITNestedQueryPushDownTest integration test](https://github.com/apache/druid/actions/runs/5413896842/jobs/9845501899?pr=14460#logs).

It's not able to build a docker image due to a problem with busybox --install. I tried rerunning it a few times and it still fails. I suspect it's something build-cache-related, because I don't see how this patch would be related, especially given that all the other ITs passed. So, I'm going to merge this patch anyway.

@gianm gianm merged commit e10e35a into apache:master Jun 29, 2023
@gianm gianm deleted the regexp-replace branch June 29, 2023 20:48
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
* Add REGEXP_REPLACE function.

Replaces all instances of a pattern with a replacement string.

* Fixes.

* Improve test coverage.

* Adjust behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants