No dictionary group by perf #8195

Merged

@richardstartin richardstartin commented Feb 11, 2022

This change is motivated by slow queries at one of our customers that group by a raw column. 30GB was seen to be allocated by NoDictionaryMultiColumnGroupKeyGenerator.generateKeyForBlock, which is also where most of the method samples were taken:
Screenshot 2022-02-11 at 18 33 25
Screenshot 2022-02-11 at 18 35 21

This PR starts by generalising one of our pre-existing benchmarks which does a good job of exercising the entire query execution. It is parameterised so different queries can be added easily, and the generated data is parameterised too so that columns with different cardinalities can be created.

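Then, the actual improvement is made in the second commit. It transposes the group key generation, since the BlockValSets will be cached by DataBlockCache anyway, then accumulates keys into a flyweight, which only needs to be allocated to memoize the group key on its first occurrence. In the benchmark below, this cuts average time by roughly 35% to 48% (close to half in the EXP(0.5) and EXP(0.999) scenarios) and reduces normalized allocation by about 3.5x for EXP(0.001) and by more than 100x for the other two scenarios, although the allocation measurements are noisy: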

before:

Benchmark                                                (_numRows)                                                                       (_query)  (_scenario)  Mode  Cnt          Score           Error   Units
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        200.573 ±        36.577   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        454.459 ±       985.590  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  139180218.880 ± 299249329.376    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        510.589 ±       414.979  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  156957846.187 ± 109973654.666    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5         12.236 ±        42.494  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    3732222.293 ±  12807297.981    B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space          1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          4.412 ±        19.484  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm     1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    1398101.333 ±   6240670.451    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          8.000                  counts
BenchmarkQueries.query:·gc.time                             1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        407.000                      ms
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5         98.663 ±         7.845   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        696.114 ±      1498.561  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  106449429.440 ± 228808237.226    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5       1029.174 ±      2217.348  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  157963208.145 ± 341415795.326    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          0.109 ±         0.714  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5      16481.891 ±    107743.719    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          4.000                  counts
BenchmarkQueries.query:·gc.time                             1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5         10.000                      ms
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5         90.816 ±         8.115   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        752.671 ±      1621.248  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  106393285.309 ± 228688298.313    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5       1022.984 ±      2203.794  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  143076606.448 ± 309117349.271    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          0.150 ±         0.832  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5      21384.158 ±    119911.555    B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space          1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          0.126 ±         1.082  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm     1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5      17476.267 ±    150475.927    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          4.000                  counts
BenchmarkQueries.query:·gc.time                             1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5         52.000                      ms

after:

Benchmark                                                (_numRows)                                                                       (_query)  (_scenario)  Mode  Cnt         Score          Error   Units
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5       130.071 ±        5.744   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5       197.775 ±      424.170  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  39989639.600 ± 85754314.959    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5       210.675 ±      161.001  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  42677043.200 ± 33405655.687    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5         0.835 ±        6.634  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    167488.000 ±  1331249.740    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        11.000                 counts
BenchmarkQueries.query:·gc.time                             1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5       268.000                     ms
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        54.864 ±        4.432   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        10.390 ±       18.671  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5    883504.473 ±  1574108.609    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5           ≈ 0                 counts
BenchmarkQueries.query                                      1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        47.429 ±        2.367   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        10.961 ±       19.191  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5    811307.408 ±  1420111.422    B/op
BenchmarkQueries.query:·gc.count                            1500000  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5           ≈ 0                 counts
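
To illustrate the idea behind the second commit, here is a minimal sketch of a transposed, flyweight-based key generator. It is not the code from this PR: the class `FlyweightGroupKeys`, its `Key` type, the `keyAllocations` counter and the restriction to int columns are all assumptions made for illustration. Values are copied column by column into a reusable scratch buffer, each row is then looked up through a single reusable key, and a new key is allocated only the first time a group is seen.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and structure are not taken from Pinot.
final class FlyweightGroupKeys {
  /** Key whose equality and hash are based on the contents of the wrapped array. */
  static final class Key {
    final int[] values;

    Key(int[] values) {
      this.values = values;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Key && Arrays.equals(values, ((Key) o).values);
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(values);
    }
  }

  private final Map<Key, Integer> groupIds = new HashMap<>();
  private int[] scratch = new int[0]; // reused across blocks
  private Key flyweight;              // reused for every lookup
  int keyAllocations;                 // counts keys that were actually memoized

  /** columns[c][row] is the value of group-by column c for the given row of the block. */
  int[] generateKeysForBlock(int[][] columns, int numRows) {
    int numCols = columns.length;
    if (scratch.length < numRows * numCols) {
      scratch = new int[numRows * numCols];
    }
    if (flyweight == null || flyweight.values.length != numCols) {
      flyweight = new Key(new int[numCols]);
    }
    // Transposed fill: the outer loop is over columns, so any per-column
    // decision (for example on data type) is made once per block, not once per row.
    for (int c = 0; c < numCols; c++) {
      int[] column = columns[c];
      for (int row = 0; row < numRows; row++) {
        scratch[row * numCols + c] = column[row];
      }
    }
    int[] ids = new int[numRows];
    for (int row = 0; row < numRows; row++) {
      // Load the row into the flyweight; nothing is allocated for the lookup itself.
      System.arraycopy(scratch, row * numCols, flyweight.values, 0, numCols);
      Integer id = groupIds.get(flyweight);
      if (id == null) {
        // First occurrence: memoize an immutable copy of the key.
        id = groupIds.size();
        groupIds.put(new Key(flyweight.values.clone()), id);
        keyAllocations++;
      }
      ids[row] = id;
    }
    return ids;
  }
}
```

In this sketch the number of key allocations grows with the number of distinct groups rather than with the number of rows, which is consistent with the larger allocation savings reported above for the EXP(0.5) and EXP(0.999) scenarios than for EXP(0.001).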

@codecov-commenter
codecov-commenter commented Feb 11, 2022

Codecov Report

Merging #8195 (9936d55) into master (bad7106) will decrease coverage by 1.25%.
The diff coverage is 94.44%.

@@             Coverage Diff              @@
##             master    #8195      +/-   ##
============================================
- Coverage     71.33%   70.07%   -1.26%     
- Complexity     4308     4314       +6     
============================================
  Files          1623     1624       +1     
  Lines         84365    84883     +518     
  Branches      12657    12794     +137     
============================================
- Hits          60183    59485     -698     
- Misses        20050    21286    +1236     
+ Partials       4132     4112      -20     
Flag Coverage Δ
integration1 28.73% <66.66%> (-0.11%) ⬇️
integration2 ?
unittests1 67.46% <94.44%> (-0.41%) ⬇️
unittests2 14.14% <0.00%> (-0.07%) ⬇️

Impacted Files Coverage Δ
...upby/NoDictionaryMultiColumnGroupKeyGenerator.java 63.55% <94.28%> (+0.62%) ⬆️
...java/org/apache/pinot/spi/utils/FixedIntArray.java 61.53% <100.00%> (+3.20%) ⬆️
...t/core/plan/StreamingInstanceResponsePlanNode.java 0.00% <0.00%> (-100.00%) ⬇️
...ore/operator/streaming/StreamingResponseUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...ager/realtime/PeerSchemeSplitSegmentCommitter.java 0.00% <0.00%> (-100.00%) ⬇️
...pache/pinot/common/utils/grpc/GrpcQueryClient.java 0.00% <0.00%> (-94.74%) ⬇️
...he/pinot/core/plan/StreamingSelectionPlanNode.java 0.00% <0.00%> (-88.89%) ⬇️
...ator/streaming/StreamingSelectionOnlyOperator.java 0.00% <0.00%> (-87.81%) ⬇️
...re/query/reduce/SelectionOnlyStreamingReducer.java 0.00% <0.00%> (-85.72%) ⬇️
...oker/broker/BrokerServiceAutoDiscoveryFeature.java 0.00% <0.00%> (-81.82%) ⬇️
... and 103 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@richardstartin richardstartin force-pushed the no-dictionary-group-by-perf branch 2 times, most recently from 6db76af to 36c68fc Compare February 14, 2022 11:45
@richardstartin

Added another benchmark with a group by on a raw string column. The effect is mostly drowned out by excessive String allocation (no interning 😢), but this branch still offers a significant improvement. Note that the majority of the branches are pruned by tiered compilation, and those that remain are entirely predictable.

Before

Benchmark                                                (_numRows)                                                                                                    (_query)  (_scenario)  Mode  Cnt          Score            Error   Units
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        401.449 ±         40.560   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        594.902 ±       1280.361  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  352657742.400 ±  758730386.850    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        816.002 ±       1783.388  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  490244232.533 ± 1075546345.417    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5         13.856 ±         50.128  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    8247320.000 ±   29878654.316    B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space          1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          1.813 ±         15.611  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm     1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    1118481.067 ±    9630459.297    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          4.000                   counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        259.000                       ms
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        219.553 ±          2.789   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        947.525 ±       2038.543  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  318633695.680 ±  685512352.413    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5       1342.334 ±       1944.947  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  451726540.800 ±  656852332.622    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          0.056 ±          0.383  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5      18864.320 ±     128684.949    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          6.000                   counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5         20.000                       ms
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        211.974 ±          4.234   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        970.031 ±       2086.994  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  318102240.640 ±  684372243.782    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5       1127.183 ±        696.552  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  370147328.000 ±  229406341.423    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          0.175 ±          0.974  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5      57655.680 ±     321478.066    B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space          1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          0.255 ±          2.194  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm     1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5      83886.080 ±     722284.447    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          6.000                   counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5         14.000                       ms

After

Benchmark                                                (_numRows)                                                                                                    (_query)  (_scenario)  Mode  Cnt          Score           Error   Units
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        347.712 ±        27.737   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        434.512 ±       935.011  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  235552355.733 ± 506651891.452    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        446.016 ±      1000.169  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5  241731720.533 ± 541560831.721    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5         11.443 ±        44.382  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5    6151645.867 ±  23774788.552    B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space          1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          1.013 ±         8.723  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm     1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5     559240.533 ±   4815229.649    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5          4.000                  counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.001)  avgt    5        129.000                      ms
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        185.713 ±         4.709   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        684.654 ±      1472.475  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  194064742.933 ± 417372573.618    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5        742.663 ±      2610.572  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5  210449203.200 ± 739758601.746    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          0.022 ±         0.186  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5       6095.467 ±     52483.806    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          3.000                  counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL     EXP(0.5)  avgt    5          7.000                      ms
BenchmarkQueries.query                                      1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        183.992 ±         7.627   ms/op
BenchmarkQueries.query:·gc.alloc.rate                       1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        686.789 ±      1477.255  MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  193360184.800 ± 415859888.283    B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space              1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5        919.579 ±      1979.891  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm         1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5  259487607.467 ± 558628861.127    B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen                 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          0.032 ±         0.190  MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5       8994.933 ±     53455.434    B/op
BenchmarkQueries.query:·gc.count                            1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5          4.000                  counts
BenchmarkQueries.query:·gc.time                             1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL   EXP(0.999)  avgt    5         22.000                      ms

@Jackie-Jiang Jackie-Jiang left a comment

This PR definitely improves the performance by reducing unnecessary allocations. My concern is whether adding the branches can hurt performance compared to reusing the keys (even though the overall performance is still much better because of the reduced allocations). If we are confident that the JVM can optimize the branches so that the overhead is negligible, then it is good to go.

@richardstartin

> This PR definitely improves the performance by reducing unnecessary allocations. My concern is whether adding the branches can hurt performance compared to reusing the keys (even though the overall performance is still much better because of the reduced allocations). If we are confident that the JVM can optimize the branches so that the overhead is negligible, then it is good to go.

I'm not too concerned about the branches that don't get pruned, because they are predictable (they go the same way for the entire block). That said, I think we could have the best of both worlds with code generation: if we can generate a tuple, and code to lazily materialise the tuple, for each shape we see in queries (of which I doubt there would be more than 20 patterns in a realistic deployment), we can have both a low memory footprint and no conditionals.
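
As a rough illustration of what such generated code might look like, here is a hand-written stand-in for one shape, (int, long). Everything in it is hypothetical: `IntLongTupleSketch`, `IntLongTuple`, `set`, `materialise` and `groupId` are names invented for this sketch, and no code generator is shown. Because the shape is fixed, there are no per-column type conditionals, and a tuple is only allocated when a group is seen for the first time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class that would be generated for the shape (int, long).
final class IntLongTupleSketch {
  static final class IntLongTuple {
    int first;
    long second;

    /** Overwrites the fields in place so one instance can serve every lookup. */
    IntLongTuple set(int first, long second) {
      this.first = first;
      this.second = second;
      return this;
    }

    /** Lazily materialises an independent copy, called only for unseen groups. */
    IntLongTuple materialise() {
      return new IntLongTuple().set(first, second);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof IntLongTuple
          && first == ((IntLongTuple) o).first
          && second == ((IntLongTuple) o).second;
    }

    @Override
    public int hashCode() {
      return 31 * Integer.hashCode(first) + Long.hashCode(second);
    }
  }

  private final Map<IntLongTuple, Integer> groupIds = new HashMap<>();
  private final IntLongTuple flyweight = new IntLongTuple();

  /** Returns the group id for the pair, assigning the next id on first occurrence. */
  int groupId(int first, long second) {
    Integer id = groupIds.get(flyweight.set(first, second));
    if (id == null) {
      id = groupIds.size();
      // Only the materialised copy is stored; the flyweight itself is never a map key.
      groupIds.put(flyweight.materialise(), id);
    }
    return id;
  }

  int numGroups() {
    return groupIds.size();
  }
}
```

The trade-off in this design is one class per shape, which is only practical if the number of distinct shapes stays small, as suggested above.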

@richardstartin richardstartin merged commit 7678019 into apache:master Feb 15, 2022
xiangfu0 pushed a commit to xiangfu0/pinot that referenced this pull request Feb 23, 2022
* generalise benchmark, add raw sum and raw group by

* memory efficiency in no dictionary multi group by