-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the bug of reading decimal value stored in int32 or int64 in ParquetNativeRecordReader #11840
Conversation
Codecov Report
@@ Coverage Diff @@
## master #11840 +/- ##
============================================
+ Coverage 63.03% 63.06% +0.02%
Complexity 1141 1141
============================================
Files 2350 2350
Lines 127284 127293 +9
Branches 19604 19607 +3
============================================
+ Hits 80236 80279 +43
+ Misses 41339 41300 -39
- Partials 5709 5714 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 12 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's easy, can we add the case that would cover to test decimal stored in int?
LGTM otherwise.
.../src/main/java/org/apache/pinot/plugin/inputformat/parquet/ParquetNativeRecordExtractor.java
Show resolved
Hide resolved
.../src/main/java/org/apache/pinot/plugin/inputformat/parquet/ParquetNativeRecordExtractor.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this change. @xiangfu0 - How are we verifying this change ?
6e184b1
to
7550f7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
.../src/main/java/org/apache/pinot/plugin/inputformat/parquet/ParquetNativeRecordExtractor.java
Show resolved
Hide resolved
d8ea5ce
to
4e4d152
Compare
Added a test file to test logical types and just one simple test for decimal data type. |
4e4d152
to
a25701b
Compare
In Apache Parquet, decimal logical types can be represented by using either the
INT32
,INT64
, orFIXED_LEN_BYTE_ARRAY
primitive types, depending on the precision needed for the decimal numbers.The logical type specification includes two components: precision and scale.
Precision
: The total number of digits that the decimal number can have.Scale
: The number of digits that can appear after the decimal point.Here's how these map:
INT32
: Suitable for decimal types that require up to 9 digits of precision.INT64
: Suitable for decimal types that require up to 18 digits of precision.FIXED_LEN_BYTE_ARRAY
: Suitable for decimal types that require more than 18 digits of precision. The length of the byte array would depend on the required precision.More Ref can be found here: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
This PR:
OriginalType
toLogicalTypeAnnotation