Unnecessary Scans/Reads -> Extra I/O #112

queryproc · 2021-04-10T17:01:47Z

Consider the following query:

MATCH (a)-e1->(b)
RETURN e1.time;

Currently, we consider two possible QVOs, [a, b] and [b, a], and therefore generate two plans:

P1: Scan(a) → Extend(b) → ScanEdgeProperty(e1.time) → Project([e1.time])

OR

P2: Scan(b) → Extend(a) → ScanEdgeProperty(e1.time) → Project([e1.time])

The Extend operator (to b in P1, to a in P2) is there because we cover all of the query edges and vertices blindly, regardless of what the query actually requests. The two plans should instead be:

P1: Scan(a) → ScanEdgeProperty(e1.time) → Project([e1.time])

OR

P2: Scan(b) → ScanEdgeProperty(e1.time) → Project([e1.time])
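
One way to picture the rewrite is a backward pass over the plan that drops any Extend whose output vertex is never read downstream. The sketch below is only an illustration: the Op class, its fields, and prune_unused_extends are hypothetical names, not the engine's operators, and it assumes e1.time can be read from the adjacency list of the scanned vertex alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical plan operator; not the engine's real operator class."""
    label: str             # how the operator is printed in a plan
    produces: str = ""     # variable the operator introduces, if any
    requires: tuple = ()   # variables the operator reads
    is_extend: bool = False


def prune_unused_extends(plan):
    """Drop every Extend whose output no later operator reads."""
    needed = set()
    kept = []
    # Walk from the sink back to the source, tracking which variables are still needed.
    for op in reversed(plan):
        if op.is_extend and op.produces not in needed:
            continue  # nothing downstream reads this vertex, so skip the extension
        kept.append(op)
        needed.update(op.requires)
    kept.reverse()
    return kept


# Assumption: e1.time is reachable from the adjacency list of a, without touching b.
p1 = [
    Op("Scan(a)", produces="a"),
    Op("Extend(b)", produces="b", requires=("a",), is_extend=True),
    Op("ScanEdgeProperty(e1.time)", produces="e1.time", requires=("a",)),
    Op("Project([e1.time])", requires=("e1.time",)),
]

pruned = [op.label for op in prune_unused_extends(p1)]
```

Here pruned holds Scan(a), ScanEdgeProperty(e1.time), Project([e1.time]). If the query also returned a property of b, some later operator would list b in requires and the Extend would be kept.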

In other cases, where we don't need any properties at all, we should simply read the size from the Lists header and thereby avoid pinning/unpinning unnecessarily. The query:

MATCH (a)-e1->(b)
RETURN count(*);

should generate:

P1: Scan(a) → GetListSize(e1) → GROUP BY (COUNT(*))

OR

P2: Scan(b) → GetListSize(e1) → GROUP BY (COUNT(*))
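
To make the difference concrete, here is a small sketch that contrasts counting by iterating every adjacency list with counting from a stored size. AdjList, its size field, and the reads counter are made up for illustration and only stand in for the Lists header and the page pins mentioned above.

```python
class AdjList:
    """Toy adjacency list with a header-like size field (hypothetical layout)."""

    def __init__(self, neighbours):
        self._neighbours = list(neighbours)
        self.size = len(self._neighbours)  # stands in for the Lists header
        self.reads = 0                     # stands in for pin/unpin of list pages

    def scan(self):
        """Read the list contents; this is the expensive path."""
        self.reads += 1
        return iter(self._neighbours)


def count_by_extend(lists):
    """COUNT(*) the current way: touch every edge of every list."""
    return sum(1 for adj in lists.values() for _ in adj.scan())


def count_by_list_size(lists):
    """COUNT(*) from headers only: no list contents are read."""
    return sum(adj.size for adj in lists.values())


# Forward adjacency lists for e1, keyed by the scanned vertex a.
fwd = {0: AdjList([1, 2]), 1: AdjList([2]), 2: AdjList([])}

total = count_by_list_size(fwd)
reads_after_header_count = sum(adj.reads for adj in fwd.values())
```

Both functions return the same count for this graph, but only count_by_extend increments reads, which is the extra I/O this issue is about.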

We need to be more aware of which properties a query actually needs. ID is just another property that we happen to join on.

andyfengHKU · 2023-03-01T21:38:57Z

Solved in PR #1329

queryproc added the performance optimization label Apr 18, 2021

ray6080 mentioned this issue Jul 22, 2021

List of optimizations for Stage 1 #283

Closed

15 tasks

andyfengHKU closed this as completed Mar 1, 2023
