-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix partition handling by always using string values #12115
Fix partition handling by always using string values #12115
Conversation
211499e
to
04f71c5
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #12115 +/- ##
============================================
- Coverage 61.61% 61.61% -0.01%
Complexity 1153 1153
============================================
Files 2407 2407
Lines 130922 130871 -51
Branches 20223 20222 -1
============================================
- Hits 80669 80632 -37
+ Misses 44366 44351 -15
- Partials 5887 5888 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
return String.valueOf(_partitionFunction.getPartition(convertToString(genericRow.getValue(_column)))); | ||
} | ||
|
||
private static String convertToString(Object value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to catch unsupported types here to ensure that for new types, we do the string conversion correctly ? Same for the logic in MutableSegmentImpl.java (similar to what you have in getStringValue utility)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can potentially check for all possible classes supported (e.g. Integer, Long, Float, Double, String etc.), but that might involve performance overhead (one if check per type). Since we don't add new data types frequently, I prefer keeping the check only for types that need to be converted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good. I assume that if a new type is introduced, we anyway have to modify RequestContextUtils to add the new literal type? (where the exception is thrown). if thats the case, we can consider moving this logic and the one in MutableSegmentImpl.java to RequestContextUtils, so that they are in the same file and easy to discover all the places that needs the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Moved them into FieldSpec
where the DataType
is defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM . Thanks for doing this in a generic way.
return String.valueOf(_partitionFunction.getPartition(convertToString(genericRow.getValue(_column)))); | ||
} | ||
|
||
private static String convertToString(Object value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good. I assume that if a new type is introduced, we anyway have to modify RequestContextUtils to add the new literal type? (where the exception is thrown). if thats the case, we can consider moving this logic and the one in MutableSegmentImpl.java to RequestContextUtils, so that they are in the same file and easy to discover all the places that needs the change.
04f71c5
to
14567e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LTGM!
Hi @Jackie-Jiang @snleee @swaminathanmanish , thanks for making the code changes in this PR! We really appreciate the contribution here. While we do have some concerns on making the backward incompatible changes not just for this PR (i.e. changing the signature of a public API) but for all the future PRs, as any incompatible changes could easily impact the existing running Pinot use cases and requires admins to verify, not just for 1 cluster in 1 company but for all the Pinot clusters run in different places/companies. If possible, can we do the following if the backward incompatibility is inevitable in the future PR:
By following this steps we can at least have someone to keep an eye on the incompatible changes from multiple stakeholders and have consensus promptly instead of merging it silently. Again, we really appreciate the contribution here and look forward to building the great milestones to come together! Thanks! |
@jackjlli Thanks for the suggestion! Going forward, will add a section to list all the public API changes in the PR description. |
Always use string representation of the value to compute the partition so that the value from query can always match the value from the segment.
Fix the wrong handling of
BYTES
andBIG_DECIMAL
type as partition column