Refactor ParquetNativeRecordReader to have all the types to support #9352

Merged: 1 commit, Sep 9, 2022

@xiangfu0 xiangfu0 commented Sep 9, 2022

  • Restructure the types handled by the Parquet data type converter
  • Add more test cases for Parquet files

@xiangfu0 xiangfu0 force-pushed the fixing_parquet_file branch 2 times, most recently from b5cf587 to 1d3e335, on September 9, 2022 03:44
@codecov-commenter

Codecov Report

Merging #9352 (b5cf587) into master (0f4bcfc) will decrease coverage by 43.66%.
The diff coverage is 0.00%.

❗ Current head b5cf587 differs from pull request most recent head 1d3e335. Consider uploading reports for the commit 1d3e335 to get more accurate results

@@              Coverage Diff              @@
##             master    #9352       +/-   ##
=============================================
- Coverage     69.80%   26.13%   -43.67%     
+ Complexity     4777       44     -4733     
=============================================
  Files          1875     1872        -3     
  Lines         99860    99914       +54     
  Branches      15194    15212       +18     
=============================================
- Hits          69706    26115    -43591     
- Misses        25231    71180    +45949     
+ Partials       4923     2619     -2304     
Flag Coverage Δ
integration1 26.13% <0.00%> (-0.07%) ⬇️
integration2 ?
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...utformat/parquet/ParquetNativeRecordExtractor.java 0.00% <0.00%> (-68.00%) ⬇️
...in/java/org/apache/pinot/spi/utils/BytesUtils.java 0.00% <0.00%> (-100.00%) ⬇️
.../java/org/apache/pinot/spi/utils/BooleanUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/BaseRecording.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/NoOpRecording.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/config/table/FSTType.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/config/user/RoleType.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/data/MetricFieldSpec.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/utils/NullValueUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/common/tier/TierFactory.java 0.00% <0.00%> (-100.00%) ⬇️
... and 1376 more

@xiangfu0 xiangfu0 force-pushed the fixing_parquet_file branch 3 times, most recently from 3ec277f to ab614b5, on September 9, 2022 04:20

@Jackie-Jiang Jackie-Jiang left a comment

Do we need to check in so many test files? Wouldn't one test file covering all the field types be enough?

@xiangfu0 xiangfu0 commented Sep 9, 2022

Do we need to check in so many test files? Wouldn't one test file covering all the field types be enough?

These are small Parquet files taken from:
https://github.com/apache/parquet-testing/tree/master/data
https://github.com/dask/fastparquet/tree/main/test-data

I copied them over and want to make sure we pass all of them until we find a better test plan.

There are still files we don't support yet, e.g. because a compression library is missing or because they use encryption.
I commented those files out in the tests; the goal is to eventually uncomment them all.

@xiangfu0 xiangfu0 merged commit e3939a4 into apache:master Sep 9, 2022
@xiangfu0 xiangfu0 deleted the fixing_parquet_file branch September 9, 2022 23:07