Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: unify many_one and many_many storage #2912

Merged
merged 2 commits into from
Mar 3, 2024

Conversation

ray6080
Copy link
Contributor

@ray6080 ray6080 commented Feb 19, 2024

Remove the special storage layout for MANY/ONE_ONE rel tables. Before this PR, we store columns in the fwd direction of a MANY/ONE_ONE table or bwd direction of a ONE/ONE_MANY table as same as columns in a node table. That is to say, data in those columns are not stored in the CSR layout.
This was introduced for two reasons: 1) as a special performance optimization that we will not factorize the scan output of these columns, instead go for vectorization, as the max degree is only 1; 2) as a usability feature to enforce the degree constraint.
The downside of this special optimization is that it complicates the code path in storage and our scan operators. It's good to keep it as a usability feature, and we should revisit the layout to generalize the performance optimization to all rel tables, regardless of their rel multiplicity, as scans on small degree MANY_MANY tables can also suffer from unnecessary factorization when we should switch to full vectorization.

Note: this PR introduces performance degrades on MANY/ONE_ONE tables. See #2988. A temporary solution for the degrades, which adds a special path for scanning over MANY/ONE_ONE rel tables in CSR format can possibly be introduced if the performance issue is critical, though prefer not to do it now as we should solve it alongside #2988.

Copy link

codecov bot commented Feb 20, 2024

Codecov Report

Attention: Patch coverage is 91.85185% with 55 lines in your changes are missing coverage. Please review.

Project coverage is 93.30%. Comparing base (df5ec1c) to head (2f8ab72).

Files Patch % Lines
src/storage/store/rel_table_data.cpp 91.65% 42 Missing ⚠️
src/storage/store/rel_table.cpp 83.33% 3 Missing ⚠️
src/common/enums/rel_multiplicity.cpp 80.00% 2 Missing ⚠️
src/common/exception/message.cpp 50.00% 2 Missing ⚠️
src/include/storage/store/rel_table_data.h 90.00% 2 Missing ⚠️
src/processor/operator/persistent/copy_rel.cpp 92.30% 2 Missing ⚠️
src/storage/local_storage/local_rel_table.cpp 91.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2912      +/-   ##
==========================================
- Coverage   93.52%   93.30%   -0.22%     
==========================================
  Files        1125     1121       -4     
  Lines       43090    42863     -227     
==========================================
- Hits        40300    39995     -305     
- Misses       2790     2868      +78     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ray6080 ray6080 force-pushed the refactor-many-one-table branch 2 times, most recently from 9ff7bd9 to 1807229 Compare February 21, 2024 14:14
@ray6080 ray6080 marked this pull request as ready for review February 21, 2024 14:14
@ray6080 ray6080 force-pushed the refactor-many-one-table branch 2 times, most recently from 4b76d5b to 063ed1e Compare March 3, 2024 14:53
@ray6080 ray6080 changed the title Unify many_one and many_many storage Refactor: unify many_one and many_many storage Mar 3, 2024
@ray6080 ray6080 merged commit f9e9e29 into master Mar 3, 2024
15 checks passed
@ray6080 ray6080 deleted the refactor-many-one-table branch March 3, 2024 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants