-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
skip invalid json string rather than throwing error during json indexing #12238
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #12238 +/- ##
============================================
+ Coverage 61.58% 61.73% +0.14%
+ Complexity 1153 207 -946
============================================
Files 2415 2425 +10
Lines 131502 132728 +1226
Branches 20310 20533 +223
============================================
+ Hits 80988 81940 +952
- Misses 44587 44770 +183
- Partials 5927 6018 +91
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
JsonNode jsonNode; | ||
try { | ||
jsonNode = JsonUtils.stringToJsonNode(jsonString); | ||
} catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why limit to IOException? Can we catch a Throwable
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the actual exception here we supposed to cathc is malform json exception only. all other issues should still throw --> e..g we should not return incorrect result or simply replace that with the default when there are other issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @walterddr ,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ankitsultana @walterddr
I change the source catch JsonProcessingException in the most recent commit because the stringToJsonNode internal invoke the DEFAULT_READER.readTree() which throws JsonProcessingException only.
JsonUtils.stringToJsonNode(...) throws IOException {
DEFAULT_READER.readTree(...);
}
class ObjectReader {
public JsonNode readTree(String json) throws JsonProcessingException, JsonMappingException;
}
JsonMappingException is a child of JsonProcessingException.
pinot-spi/src/main/java/org/apache/pinot/spi/utils/JsonUtils.java
Outdated
Show resolved
Hide resolved
JsonNode jsonNode; | ||
try { | ||
jsonNode = JsonUtils.stringToJsonNode(jsonString); | ||
} catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the actual exception here we supposed to cathc is malform json exception only. all other issues should still throw --> e..g we should not return incorrect result or simply replace that with the default when there are other issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm mostly. one more comment on the testing.
@Test | ||
public void testSkipInvalidJson() throws Exception { | ||
JsonIndexConfig jsonIndexConfig = new JsonIndexConfig(); | ||
jsonIndexConfig.setSkipInvalidJson(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you
- add a test when it is false it still throws?
- add a test for other exception is still throws when this is set to true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you
- add a test when it is false it still throws?
- add a test for other exception is still throws when this is set to true?
for the 2nd requirement, I haven't figure a way to test since the JsonUtils make the objectReader as a static final member and I cannot replace a static final member without changing the constructor signature or not using more powerful mock libraray.
public static final ObjectReader DEFAULT_READER = DEFAULT_MAPPER.reader();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late reply. that fine if we can't find it.
Deployed this change to the company's prod cluster with following config. The ERROR segments containing invalid json rows become ONLINE.
|
...-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/JsonIndexTest.java
Outdated
Show resolved
Hide resolved
...-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/JsonIndexTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies. Had stamped but had a question. @wirybeaver can you take a look?
no worries |
@walterddr Could you review again to see if there's any objection? |
Merged since very minor changes since Rong's last review. (Rong is OOO for a week) |
Add a new configuration to skip broken json string and continue indexing following json strings instead of throw exception.