perf: improve GetExperiments + SearchExperiments counting #8801

NicholasBlaskey · 2024-02-05T20:43:19Z

Description

GetExperiments and SearchExperiments take a long time to count experiments for pagination.

I think this is partly because the introduction of the trials view made it hard for the query optimizer to plan the count query well.

Test Plan

integration tests

netlify · 2024-02-05T20:43:24Z

✅ Deploy Preview for determined-ui canceled.

Name	Link
🔨 Latest commit	`5a87208`
🔍 Latest deploy log	https://app.netlify.com/sites/determined-ui/deploys/65cb88aa1bb77f0007283cf4

codecov · 2024-02-05T20:56:59Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d8d9965) 47.69% compared to head (5a87208) 47.69%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #8801   +/-   ##
=======================================
  Coverage   47.69%   47.69%           
=======================================
  Files        1065     1065           
  Lines      169612   169611    -1     
  Branches     2240     2238    -2     
=======================================
+ Hits        80901    80903    +2     
+ Misses      88553    88550    -3     
  Partials      158      158

Flag	Coverage Δ
backend	`43.47% <100.00%> (-0.01%)`	⬇️
harness	`64.12% <ø> (+<0.01%)`	⬆️
web	`42.55% <ø> (ø)`

Files	Coverage Δ
master/internal/api_experiment.go	`54.76% <100.00%> (-0.03%)`	⬇️
master/internal/experiment_filter.go	`95.91% <100.00%> (ø)`

... and 3 files with indirect coverage changes

NicholasBlaskey · 2024-02-05T21:24:27Z

New pagination count for experiments-search

EXPLAIN ANALYZE
SELECT count(*)
FROM experiments AS e
LEFT JOIN users u ON e.owner_id = u.id
LEFT JOIN projects p ON e.project_id = p.id
LEFT JOIN workspaces w ON p.workspace_id = w.id
LEFT JOIN runs AS r ON r.id = e.best_trial_id;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6538.56..6538.57 rows=1 width=8) (actual time=16.379..16.380 rows=1 loops=1)
   ->  Seq Scan on experiments e  (cost=0.00..6432.05 rows=42605 width=0) (actual time=0.010..14.411 rows=42605 loops=1)
 Planning Time: 0.131 ms
 Execution Time: 16.404 ms
(4 rows)
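
None of the four LEFT JOINs in the query appears in the plan above; only the scan on experiments remains. That is consistent with PostgreSQL's join removal, which can drop a LEFT JOIN when the joined side is unique on the join key and none of its columns are used. The sketch below illustrates why dropping such joins does not change a count. It uses Python's built-in sqlite3 module and a toy schema, both of which are assumptions for illustration, not PostgreSQL or the real Determined tables.

```python
import sqlite3

# Toy schema, assumed for illustration only; not the real Determined tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE runs (id INTEGER PRIMARY KEY);
    CREATE TABLE experiments (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER,
        best_trial_id INTEGER
    );
""")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO runs (id) VALUES (?)", [(10,), (11,)])
conn.executemany(
    "INSERT INTO experiments (id, owner_id, best_trial_id) VALUES (?, ?, ?)",
    # Row 2 has no best trial; row 4 points at a run id that does not exist.
    [(1, 1, 10), (2, 1, None), (3, 2, 11), (4, 2, 99)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Each LEFT JOIN targets a primary key, so it matches at most one row per
# experiment and can neither duplicate nor drop experiment rows.
joined_count = scalar("""
    SELECT count(*)
    FROM experiments AS e
    LEFT JOIN users u ON e.owner_id = u.id
    LEFT JOIN runs AS r ON r.id = e.best_trial_id
""")
plain_count = scalar("SELECT count(*) FROM experiments")

print(joined_count, plain_count)  # both counts are 4
```

The equality holds only because each join key is unique on the joined side; a join to a non-unique column could multiply rows and would have to stay in the count query.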

NicholasBlaskey · 2024-02-05T21:40:43Z

Old pagination count for experiments-search

                                                                                                    QUERY PLAN                                                                                                     
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=37071.25..37071.26 rows=1 width=8) (actual time=386.638..390.547 rows=1 loops=1)                                                                                  
   ->  Gather  (cost=37071.04..37071.25 rows=2 width=8) (actual time=386.630..390.542 rows=3 loops=1)                                                                                        
         Workers Planned: 2                                                                                                                                                                  
         Workers Launched: 2                                                                                                                                                                 
         ->  Partial Aggregate  (cost=36071.04..36071.05 rows=1 width=8) (actual time=375.230..375.240 rows=1 loops=3)                                                                       
               ->  Hash Join  (cost=29420.43..36026.66 rows=17752 width=0) (actual time=327.340..371.486 rows=14202 loops=3)                                                                 
                     Hash Cond: (p.workspace_id = w.id)                                                                                                                                      
                     ->  Hash Join  (cost=29419.36..35939.50 rows=17752 width=4) (actual time=327.215..364.856 rows=14202 loops=3)                                                                                
                           Hash Cond: (e.project_id = p.id)                                                                                                                                                       
                           ->  Hash Join  (cost=29418.25..35852.29 rows=17752 width=4) (actual time=327.189..358.015 rows=14202 loops=3)                                                                          
                                 Hash Cond: (e.owner_id = u.id)                                                                                                                                                                              
                                 ->  Parallel Hash Left Join  (cost=29403.96..35790.56 rows=17752 width=8) (actual time=327.067..351.530 rows=14202 loops=3)                                                                                 
                                       Hash Cond: (e.best_trial_id = t.run_id)                                                                                                                                                               
                                       ->  Parallel Hash Left Join  (cost=14701.98..20987.04 rows=17752 width=12) (actual time=161.228..177.258 rows=14202 loops=3)
                                             Hash Cond: (e.best_trial_id = t_1.run_id)
                                             ->  Parallel Seq Scan on experiments e  (cost=0.00..6183.52 rows=17752 width=12) (actual time=0.007..7.922 rows=14202 loops=3)
                                             ->  Parallel Hash  (cost=12905.18..12905.18 rows=143744 width=4) (actual time=159.716..159.719 rows=81455 loops=3)
                                                   Buckets: 262144  Batches: 1  Memory Usage: 11680kB
                                                   ->  Parallel Hash Join  (cost=8990.40..12905.18 rows=143744 width=4) (actual time=55.842..128.631 rows=81455 loops=3)
                                                         Hash Cond: (t_1.run_id = r_1.id)
                                                         ->  Parallel Seq Scan on trials_v2 t_1  (cost=0.00..3537.44 rows=143744 width=4) (actual time=0.004..12.519 rows=81455 loops=3)
                                                         ->  Parallel Hash  (cost=7725.91..7725.91 rows=101159 width=4) (actual time=55.696..55.697 rows=81455 loops=3)
                                                               Buckets: 262144  Batches: 1  Memory Usage: 11648kB
                                                               ->  Parallel Index Only Scan using trials_pkey on runs r_1  (cost=0.42..7725.91 rows=101159 width=4) (actual time=1.171..22.921 rows=81455 loops=3)
                                                                     Heap Fetches: 0
                                       ->  Parallel Hash  (cost=12905.18..12905.18 rows=143744 width=4) (actual time=165.075..165.077 rows=81455 loops=3)
                                             Buckets: 262144  Batches: 1  Memory Usage: 11680kB
                                             ->  Parallel Hash Join  (cost=8990.40..12905.18 rows=143744 width=4) (actual time=58.237..129.305 rows=81455 loops=3)
                                                   Hash Cond: (t.run_id = r.id)
                                                   ->  Parallel Seq Scan on trials_v2 t  (cost=0.00..3537.44 rows=143744 width=4) (actual time=0.005..11.109 rows=81455 loops=3)
                                                   ->  Parallel Hash  (cost=7725.91..7725.91 rows=101159 width=4) (actual time=56.154..56.154 rows=81455 loops=3)
                                                         Buckets: 262144  Batches: 1  Memory Usage: 11648kB
                                                         ->  Parallel Index Only Scan using trials_pkey on runs r  (cost=0.42..7725.91 rows=101159 width=4) (actual time=0.016..24.312 rows=81455 loops=3)
                                                               Heap Fetches: 0
                                 ->  Hash  (cost=11.35..11.35 rows=235 width=4) (actual time=0.107..0.107 rows=235 loops=3)
                                       Buckets: 1024  Batches: 1  Memory Usage: 17kB
                                       ->  Seq Scan on users u  (cost=0.00..11.35 rows=235 width=4) (actual time=0.008..0.060 rows=235 loops=3)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=8) (actual time=0.011..0.012 rows=5 loops=3)
                                 Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                 ->  Seq Scan on projects p  (cost=0.00..1.05 rows=5 width=8) (actual time=0.007..0.008 rows=5 loops=3)
                     ->  Hash  (cost=1.03..1.03 rows=3 width=4) (actual time=0.020..0.021 rows=3 loops=3)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on workspaces w  (cost=0.00..1.03 rows=3 width=4) (actual time=0.011..0.012 rows=3 loops=3)
 Planning Time: 0.944 ms
 Execution Time: 390.619 ms
(45 rows)
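
For a rough comparison of the two plans, the execution times reported above can be pulled out of the EXPLAIN ANALYZE text and divided. This is a minimal Python sketch: the helper name `execution_time_ms` is hypothetical, and the inputs are just the timing lines quoted in this thread, each from a single run, so the ratio is indicative rather than a benchmark.

```python
import re


def execution_time_ms(plan_text: str) -> float:
    """Return the 'Execution Time' value (in ms) from EXPLAIN ANALYZE text output."""
    match = re.search(r"Execution Time:\s*([0-9.]+)\s*ms", plan_text)
    if match is None:
        raise ValueError("no 'Execution Time' line found")
    return float(match.group(1))


# Timing lines copied from the two plans in this thread.
new_plan = " Planning Time: 0.131 ms\n Execution Time: 16.404 ms"
old_plan = " Planning Time: 0.944 ms\n Execution Time: 390.619 ms"

speedup = execution_time_ms(old_plan) / execution_time_ms(new_plan)
print(f"speedup: {speedup:.1f}x")  # speedup: 23.8x
```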

cla-bot bot added the cla-signed label Feb 5, 2024

NicholasBlaskey marked this pull request as ready for review February 5, 2024 22:27

NicholasBlaskey requested a review from a team as a code owner February 5, 2024 22:27

NicholasBlaskey requested a review from hamidzr February 5, 2024 22:27

hamidzr approved these changes Feb 8, 2024

NicholasBlaskey added 3 commits February 13, 2024 10:18

perf: improve GetExperiments + SearchExperiments counting

6d0e0ca

fix

ec3a7f7

Revert "ci: wait longer for performance test db to startup"

3454217

This reverts commit 4f874b9.

NicholasBlaskey force-pushed the perf_improve_get_experiments branch from 82c66ef to 3454217 February 13, 2024 15:18

NicholasBlaskey requested a review from a team as a code owner February 13, 2024 15:18

NicholasBlaskey requested a review from dzhu February 13, 2024 15:18

perf imrpvoe

5a87208

NicholasBlaskey merged commit 7a13863 into main Feb 13, 2024
73 of 86 checks passed

NicholasBlaskey deleted the perf_improve_get_experiments branch February 13, 2024 15:51

maxrussell pushed a commit that referenced this pull request Mar 21, 2024

perf: improve GetExperiments + SearchExperiments counting (#8801)

69ef4bd