Refactor ParquetNativeRecordReader to have all the types to support #9352

xiangfu0 · 2022-09-09T02:10:22Z

Restructure the set of types that the Parquet data type converter needs to implement.
Add more test cases for Parquet files.

codecov-commenter · 2022-09-09T03:53:39Z

Codecov Report

Merging #9352 (b5cf587) into master (0f4bcfc) will decrease coverage by 43.66%.
The diff coverage is 0.00%.

❗ Current head b5cf587 differs from pull request most recent head 1d3e335. Consider uploading reports for the commit 1d3e335 to get more accurate results

@@              Coverage Diff              @@
##             master    #9352       +/-   ##
=============================================
- Coverage     69.80%   26.13%   -43.67%     
+ Complexity     4777       44     -4733     
=============================================
  Files          1875     1872        -3     
  Lines         99860    99914       +54     
  Branches      15194    15212       +18     
=============================================
- Hits          69706    26115    -43591     
- Misses        25231    71180    +45949     
+ Partials       4923     2619     -2304

Flag	Coverage Δ
integration1	`26.13% <0.00%> (-0.07%)`	⬇️
integration2	`?`
unittests1	`?`
unittests2	`?`

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...utformat/parquet/ParquetNativeRecordExtractor.java	`0.00% <0.00%> (-68.00%)`	⬇️
...in/java/org/apache/pinot/spi/utils/BytesUtils.java	`0.00% <0.00%> (-100.00%)`	⬇️
.../java/org/apache/pinot/spi/utils/BooleanUtils.java	`0.00% <0.00%> (-100.00%)`	⬇️
...java/org/apache/pinot/spi/trace/BaseRecording.java	`0.00% <0.00%> (-100.00%)`	⬇️
...java/org/apache/pinot/spi/trace/NoOpRecording.java	`0.00% <0.00%> (-100.00%)`	⬇️
...ava/org/apache/pinot/spi/config/table/FSTType.java	`0.00% <0.00%> (-100.00%)`	⬇️
...ava/org/apache/pinot/spi/config/user/RoleType.java	`0.00% <0.00%> (-100.00%)`	⬇️
...ava/org/apache/pinot/spi/data/MetricFieldSpec.java	`0.00% <0.00%> (-100.00%)`	⬇️
...ava/org/apache/pinot/spi/utils/NullValueUtils.java	`0.00% <0.00%> (-100.00%)`	⬇️
...java/org/apache/pinot/common/tier/TierFactory.java	`0.00% <0.00%> (-100.00%)`	⬇️
... and 1376 more

Jackie-Jiang

Do we need to check in so many test files? Wouldn't a single test file covering all the field types be enough?

xiangfu0 · 2022-09-09T21:16:37Z

> Do we need to check in so many test files? Having a test file with all the field type should be enough?

Those are various small Parquet files from:
https://github.com/apache/parquet-testing/tree/master/data
https://github.com/dask/fastparquet/tree/main/test-data

I copied them over and want to make sure we pass all of them until we find a better test plan.

There are still files we don't support yet, e.g. because a compression library is missing or an encryption method is not handled.
I commented those files out in the tests; the goal is to uncomment them all.
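
As a rough illustration of the test plan described above, an explicit skip list is one alternative to commenting files out. The sketch below is not the code in this PR: the class name `ParquetTestFileSelector`, the method `selectTestFiles`, and the file names are all hypothetical placeholders, and it only filters file names rather than reading Parquet data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ParquetTestFileSelector {
  // Hypothetical skip list: placeholder names, not the actual files from this PR.
  static final Set<String> KNOWN_UNSUPPORTED = Set.of(
      "example-encrypted.parquet",
      "example-unsupported-codec.parquet");

  /** Returns the .parquet file names that are not in the skip list, in input order. */
  static List<String> selectTestFiles(List<String> allFiles, Set<String> skipped) {
    List<String> selected = new ArrayList<>();
    for (String name : allFiles) {
      // Keep only Parquet files that are not marked as unsupported.
      if (name.endsWith(".parquet") && !skipped.contains(name)) {
        selected.add(name);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<String> all = List.of("plain.parquet", "example-encrypted.parquet", "README.md");
    // Each selected file would then be passed to the record reader under test.
    System.out.println(selectTestFiles(all, KNOWN_UNSUPPORTED));
  }
}
```

With a list like this, shrinking `KNOWN_UNSUPPORTED` to empty would correspond to the stated goal of uncommenting every file.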

xiangfu0 requested review from Jackie-Jiang, kishoreg and KKcorps September 9, 2022 02:10

xiangfu0 force-pushed the fixing_parquet_file branch 2 times, most recently from b5cf587 to 1d3e335 Compare September 9, 2022 03:44

xiangfu0 force-pushed the fixing_parquet_file branch 3 times, most recently from 3ec277f to ab614b5 Compare September 9, 2022 04:20

Jackie-Jiang reviewed Sep 9, 2022

xiangfu0 requested review from Jackie-Jiang and walterddr September 9, 2022 21:16

Jackie-Jiang approved these changes Sep 9, 2022

Adding more test cases for parquet files

4abbb9c

xiangfu0 force-pushed the fixing_parquet_file branch from ab614b5 to 4abbb9c Compare September 9, 2022 21:35

xiangfu0 merged commit e3939a4 into apache:master Sep 9, 2022

xiangfu0 deleted the fixing_parquet_file branch September 9, 2022 23:07
