Adding byte functions for UUIDs #11988

Merged

mgranderath (Contributor) commented Nov 10, 2023

We use UUIDs as identifiers in the data we ingest into Pinot, and we noticed that they take up a lot of space because their String representation does not compress well. Converting them to bytes results in about 30% storage savings.

This adds two new scalar functions for dealing with UUIDs:

  • toUUIDBytes: converts the String representation of a UUID to bytes
  • fromUUIDBytes: converts the byte representation of a UUID back to a String
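
For readers who want to see the idea behind the conversion, here is a minimal sketch using only the JDK. It is not the code in this PR: the class name UuidBytesSketch and the method names toUuidBytes and fromUuidBytes are hypothetical, and the big-endian layout (most significant 8 bytes first) is an assumption made for illustration. It shows how the 36-character canonical form packs into 16 bytes and round-trips back.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Illustrative sketch only. Class and method names are hypothetical
// and this is not the implementation added to StringFunctions.java.
final class UuidBytesSketch {
  private UuidBytesSketch() {
  }

  // Packs a canonical UUID string (36 characters) into 16 bytes.
  // Assumed layout: most significant long first, then least significant long.
  static byte[] toUuidBytes(String uuid) {
    UUID parsed = UUID.fromString(uuid);
    ByteBuffer buffer = ByteBuffer.allocate(16);
    buffer.putLong(parsed.getMostSignificantBits());
    buffer.putLong(parsed.getLeastSignificantBits());
    return buffer.array();
  }

  // Rebuilds the canonical lower-case UUID string from 16 bytes.
  static String fromUuidBytes(byte[] bytes) {
    if (bytes.length != 16) {
      throw new IllegalArgumentException("Expected 16 bytes, got " + bytes.length);
    }
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    long mostSignificant = buffer.getLong();
    long leastSignificant = buffer.getLong();
    return new UUID(mostSignificant, leastSignificant).toString();
  }

  public static void main(String[] args) {
    String original = "123e4567-e89b-12d3-a456-426614174000";
    byte[] packed = toUuidBytes(original);
    System.out.println(packed.length);          // prints 16
    System.out.println(fromUuidBytes(packed));  // prints 123e4567-e89b-12d3-a456-426614174000
  }
}
```

The sketch only demonstrates the raw size difference (36 characters versus 16 bytes). The roughly 30% figure above is the storage saving we observed, which also depends on how the column is encoded and compressed.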

Thanks to @kishoreg for helping investigate this.

codecov-commenter commented Nov 10, 2023

Codecov Report

Merging #11988 (baf160f) into master (2beb9a4) will decrease coverage by 26.73%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             master   #11988       +/-   ##
=============================================
- Coverage     61.61%   34.89%   -26.73%     
- Complexity      207      927      +720     
=============================================
  Files          2385     2309       -76     
  Lines        129214   125477     -3737     
  Branches      20003    19445      -558     
=============================================
- Hits          79613    43779    -35834     
- Misses        43801    78584    +34783     
+ Partials       5800     3114     -2686     
Flag Coverage Δ
custom-integration1 ?
integration 0.00% <0.00%> (-0.01%) ⬇️
integration1 ?
integration2 0.00% <0.00%> (ø)
java-11 ?
java-21 34.89% <0.00%> (-26.59%) ⬇️
skip-bytebuffers-false 34.89% <0.00%> (-26.73%) ⬇️
skip-bytebuffers-true 0.00% <0.00%> (-27.59%) ⬇️
temurin 34.89% <0.00%> (-26.73%) ⬇️
unittests 46.74% <0.00%> (-14.86%) ⬇️
unittests1 46.74% <0.00%> (-0.20%) ⬇️
unittests2 ?

Files Coverage Δ
.../pinot/common/function/scalar/StringFunctions.java 60.46% <0.00%> (-5.64%) ⬇️

... and 843 files with indirect coverage changes

@Jackie-Jiang added the release-notes label (Referenced by PRs that need attention when compiling the next release notes) Nov 14, 2023
@Jackie-Jiang merged commit 53883ae into apache:master Nov 14, 2023
15 of 17 checks passed
@mgranderath deleted the mgranderath/uuid-string-functions branch November 16, 2023 11:34
3 participants