Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing the backward compatible issue that existing metadata may contain unescaped characters #12393

Merged
merged 1 commit into from
Feb 10, 2024

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Feb 10, 2024

Observed the issue that existing metadata with escaped characters failed the segment metadata loading.
Introduced by #11985.

Stacktrace:

org.apache.commons.lang.exception.NestableRuntimeException: Unable to parse unicode value: tils

	at org.apache.commons.lang.StringEscapeUtils.unescapeJava(StringEscapeUtils.java:337)
	at org.apache.commons.lang.StringEscapeUtils.unescapeJava(StringEscapeUtils.java:287)
	at org.apache.pinot.spi.env.CommonsConfigurationUtils.recoverSpecialCharacterInPropertyValue(CommonsConfigurationUtils.java:245)
	at org.apache.pinot.segment.spi.index.metadata.ColumnMetadataImpl.fromPropertiesConfiguration(ColumnMetadataImpl.java:284)
	at org.apache.pinot.segment.local.segment.index.ColumnMetadataTest.testMetadataWithEscapedValue(ColumnMetadataTest.java:225)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
	at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:664)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:227)
	at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
	at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:957)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:200)
	at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:148)
	at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at org.testng.TestRunner.privateRun(TestRunner.java:848)
	at org.testng.TestRunner.run(TestRunner.java:621)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:443)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:437)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:397)
	at org.testng.SuiteRunner.run(SuiteRunner.java:336)
	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:95)
	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1280)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:1200)
	at org.testng.TestNG.runSuites(TestNG.java:1114)
	at org.testng.TestNG.run(TestNG.java:1082)
	at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:65)
	at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:105)
Caused by: java.lang.NumberFormatException: For input string: "tils"
	at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.base/java.lang.Integer.parseInt(Integer.java:652)
	at org.apache.commons.lang.StringEscapeUtils.unescapeJava(StringEscapeUtils.java:331)
	... 31 more

@xiangfu0 xiangfu0 force-pushed the fixing-metadata-escape-bug branch 3 times, most recently from c43f366 to 6fad229 Compare February 10, 2024 01:40
try {
value = StringEscapeUtils.unescapeJava(value);
} catch (Exception e) {
// If the value is not a valid escaped string, ignore the exception and continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we log an error message/exception here ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to log error here. If we want to find all old segments that contains unescaped special character, a metric might be helpful.
IMO we can just leave it as is since the overhead is minimal.

@@ -241,7 +241,11 @@ public static String replaceSpecialCharacterInPropertyValue(String value) {
* {@link #replaceSpecialCharacterInPropertyValue(String)}.
*/
public static String recoverSpecialCharacterInPropertyValue(String value) {
value = StringEscapeUtils.unescapeJava(value);
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a similar change in another method (introduced by the same PR). I did not see similar issue, but wanted to just point out.

 public static String replaceSpecialCharacterInPropertyValue(String value) {
    value = StringEscapeUtils.**escapeJava**(value);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments stating this is for backward compatibility, where the old commons library escapes character in a different way

@@ -241,7 +241,11 @@ public static String replaceSpecialCharacterInPropertyValue(String value) {
* {@link #replaceSpecialCharacterInPropertyValue(String)}.
*/
public static String recoverSpecialCharacterInPropertyValue(String value) {
value = StringEscapeUtils.unescapeJava(value);
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments stating this is for backward compatibility, where the old commons library escapes character in a different way

try {
value = StringEscapeUtils.unescapeJava(value);
} catch (Exception e) {
// If the value is not a valid escaped string, ignore the exception and continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to log error here. If we want to find all old segments that contains unescaped special character, a metric might be helpful.
IMO we can just leave it as is since the overhead is minimal.

@codecov-commenter
Copy link

codecov-commenter commented Feb 10, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (2b69d6a) 61.74% compared to head (bfa4fd0) 61.72%.
Report is 3 commits behind head on master.

Files Patch % Lines
...pache/pinot/spi/env/CommonsConfigurationUtils.java 66.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12393      +/-   ##
============================================
- Coverage     61.74%   61.72%   -0.03%     
  Complexity      207      207              
============================================
  Files          2425     2428       +3     
  Lines        132766   132828      +62     
  Branches      20535    20545      +10     
============================================
+ Hits          81980    81990      +10     
- Misses        44771    44833      +62     
+ Partials       6015     6005      -10     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (ø)
integration <0.01% <0.00%> (ø)
integration1 <0.01% <0.00%> (ø)
integration2 0.00% <0.00%> (ø)
java-11 34.94% <66.66%> (-26.74%) ⬇️
java-21 61.61% <66.66%> (-0.02%) ⬇️
skip-bytebuffers-false 61.69% <66.66%> (-0.04%) ⬇️
skip-bytebuffers-true 61.58% <66.66%> (-0.01%) ⬇️
temurin 61.72% <66.66%> (-0.03%) ⬇️
unittests 61.72% <66.66%> (-0.03%) ⬇️
unittests1 46.92% <66.66%> (+0.02%) ⬆️
unittests2 27.70% <0.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@xiangfu0 xiangfu0 merged commit 43dadbf into apache:master Feb 10, 2024
19 checks passed
@xiangfu0 xiangfu0 deleted the fixing-metadata-escape-bug branch February 10, 2024 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants