Add graph pattern label pruning #2263

Merged: 1 commit, Oct 27, 2023


andyfengHKU (Contributor) commented Oct 25, 2023

This PR replaces our previous naive rel pattern label pruning with a new pruning class that prunes both node and relationship labels based on topology.

Primary use case

In RDF processing, since a join can only happen on a resource node, the following query

MATCH (a)-[e1]->(b)-[e2]->(c)

can be pruned to

MATCH (a:resource)-[e1:resource_triple]->(b:resource)-[e2:resource_triple|literal_triple]->(c:resource|literal)

The join over e1 and b benefits from the pruning: e1 is narrowed to resource_triple and b to resource.
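A minimal Python sketch of how a single topology-based pass could arrive at the pruned pattern above. It is not the C++ code in this PR: the `SCHEMA` mapping, the `prune_once` function, and the order of the two loops are assumptions made for illustration. Only the label names come from the example.

```python
# Assumed schema: rel label -> (source node label, destination node label).
SCHEMA = {
    "resource_triple": ("resource", "resource"),
    "literal_triple": ("resource", "literal"),
}
NODE_LABELS = {"resource", "literal"}

# Pattern (a)-[e1]->(b)-[e2]->(c) as (rel name, src node, dst node) triples.
PATTERN = [("e1", "a", "b"), ("e2", "b", "c")]


def prune_once(schema, node_labels, pattern):
    """Prune node labels first, then rel labels, visiting each variable once."""
    # Unlabeled variables start with every label in the schema.
    nodes = {n: set(node_labels) for _, s, d in pattern for n in (s, d)}
    rels = {r: set(schema) for r, _, _ in pattern}

    # A node keeps only the labels that each adjacent rel could attach to.
    for rel, src, dst in pattern:
        nodes[src] &= {schema[label][0] for label in rels[rel]}
        nodes[dst] &= {schema[label][1] for label in rels[rel]}

    # A rel keeps only the labels whose endpoints survive in the pruned nodes.
    for rel, src, dst in pattern:
        rels[rel] = {
            label
            for label in rels[rel]
            if schema[label][0] in nodes[src] and schema[label][1] in nodes[dst]
        }
    return nodes, rels


nodes, rels = prune_once(SCHEMA, NODE_LABELS, PATTERN)
for name in ("a", "e1", "b", "e2", "c"):
    labels = nodes.get(name) or rels.get(name)
    print(name, "|".join(sorted(labels)))
```

Under these assumptions, b ends up as resource because it must be a source of e2, and that in turn removes literal_triple from e1.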

General multi-label queries will of course also benefit from this PR.

Minutia

Technically speaking, label pruning should be performed iteratively until the algorithm converges. However, I don't think it makes much difference in practice, and we want compilation to be fast. So the current implementation performs pruning only once over each node and rel.
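To illustrate the trade-off, here is a hedged Python sketch that compares one pruning pass with iterating to a fixpoint. The schema (`knows`, `lives`, `in`), the three-hop pattern, and the `prune` and `prune_pass` helpers are all hypothetical and not taken from the PR. They only show a case where a second pass removes labels the first pass leaves behind.

```python
# Hypothetical schema: rel label -> (source node label, destination node label).
SCHEMA = {
    "knows": ("person", "person"),
    "lives": ("person", "city"),
    "in": ("city", "country"),
}

# Pattern (a)-[e1]->(b)-[e2]->(c)-[e3]->(d), with no labels given by the user.
PATTERN = [("e1", "a", "b"), ("e2", "b", "c"), ("e3", "c", "d")]


def prune_pass(schema, pattern, nodes, rels):
    """Run one node-then-rel pruning pass in place; return True if anything changed."""
    changed = False
    for rel, src, dst in pattern:
        for node, idx in ((src, 0), (dst, 1)):
            kept = nodes[node] & {schema[label][idx] for label in rels[rel]}
            changed |= kept != nodes[node]
            nodes[node] = kept
    for rel, src, dst in pattern:
        kept = {
            label
            for label in rels[rel]
            if schema[label][0] in nodes[src] and schema[label][1] in nodes[dst]
        }
        changed |= kept != rels[rel]
        rels[rel] = kept
    return changed


def prune(schema, pattern, max_passes=None):
    """Prune for at most max_passes passes, or until nothing changes if None."""
    all_labels = {label for pair in schema.values() for label in pair}
    nodes = {n: set(all_labels) for _, s, d in pattern for n in (s, d)}
    rels = {r: set(schema) for r, _, _ in pattern}
    passes = 0
    while max_passes is None or passes < max_passes:
        passes += 1
        if not prune_pass(schema, pattern, nodes, rels):
            break
    return nodes, rels


once_nodes, once_rels = prune(SCHEMA, PATTERN, max_passes=1)
full_nodes, full_rels = prune(SCHEMA, PATTERN)
print("one pass: a =", sorted(once_nodes["a"]), "e1 =", sorted(once_rels["e1"]))
print("fixpoint: a =", sorted(full_nodes["a"]), "e1 =", sorted(full_rels["e1"]))
```

In this made-up case the single pass still returns a correct but looser label set, which is consistent with treating the extra iterations as an optimization and not a correctness requirement.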

codecov bot commented Oct 25, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (fe021e9) 89.63% compared to head (73f8f9b) 89.74%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2263      +/-   ##
==========================================
+ Coverage   89.63%   89.74%   +0.10%     
==========================================
  Files        1024     1027       +3     
  Lines       36079    36151      +72     
==========================================
+ Hits        32339    32443     +104     
+ Misses       3740     3708      -32     
Files Coverage Δ
src/binder/bind/bind_graph_pattern.cpp 96.67% <ø> (+0.45%) ⬆️
src/binder/binder.cpp 97.70% <100.00%> (ø)
src/binder/bound_statement_rewriter.cpp 100.00% <100.00%> (ø)
...r/rewriter/match_clause_pattern_label_rewriter.cpp 100.00% <100.00%> (ø)
...rc/include/binder/expression/node_rel_expression.h 100.00% <100.00%> (ø)
src/include/binder/expression/rel_expression.h 100.00% <ø> (ø)
...der/rewriter/match_clause_pattern_label_rewriter.h 100.00% <100.00%> (ø)
src/processor/map/map_scan_node_property.cpp 100.00% <100.00%> (ø)
src/processor/map/map_set.cpp 100.00% <100.00%> (ø)
src/processor/operator/persistent/set_executor.cpp 97.82% <100.00%> (+0.23%) ⬆️
... and 2 more

... and 7 files with indirect coverage changes


Review comment on src/binder/query/query_graph_label_analyzer.cpp (outdated, resolved)
@andyfengHKU andyfengHKU merged commit ece824f into master Oct 27, 2023
12 checks passed
@andyfengHKU andyfengHKU deleted the query-graph-label-pruning branch October 27, 2023 05:43