perf: improve GetExperiments + SearchExperiments counting #8801

Merged
NicholasBlaskey merged 4 commits into main from perf_improve_get_experiments
Feb 13, 2024

@NicholasBlaskey NicholasBlaskey commented Feb 5, 2024

Description

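GetExperiments and SearchExperiments take a long time to compute the total experiment count used for pagination.

I think the introduction of the trials view is partly to blame. The old count query joins through the view twice (trials_v2 joined to runs, once per alias), and the planner executes both as full parallel hash joins; those two joins account for most of the roughly 390 ms in the old plan.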

Test Plan

integration tests

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

@cla-bot cla-bot bot added the cla-signed label Feb 5, 2024

netlify bot commented Feb 5, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 5a87208
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/65cb88aa1bb77f0007283cf4

codecov bot commented Feb 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d8d9965) 47.69% compared to head (5a87208) 47.69%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8801   +/-   ##
=======================================
  Coverage   47.69%   47.69%           
=======================================
  Files        1065     1065           
  Lines      169612   169611    -1     
  Branches     2240     2238    -2     
=======================================
+ Hits        80901    80903    +2     
+ Misses      88553    88550    -3     
  Partials      158      158           
Flag Coverage Δ
backend 43.47% <100.00%> (-0.01%) ⬇️
harness 64.12% <ø> (+<0.01%) ⬆️
web 42.55% <ø> (ø)

Files Coverage Δ
master/internal/api_experiment.go 54.76% <100.00%> (-0.03%) ⬇️
master/internal/experiment_filter.go 95.91% <100.00%> (ø)

... and 3 files with indirect coverage changes

NicholasBlaskey commented Feb 5, 2024

New pagination count for experiments-search

EXPLAIN ANALYZE
SELECT count(*)
FROM experiments AS e
LEFT JOIN users u ON e.owner_id = u.id
LEFT JOIN projects p ON e.project_id = p.id
LEFT JOIN workspaces w ON p.workspace_id = w.id
LEFT JOIN runs AS r ON r.id = e.best_trial_id;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6538.56..6538.57 rows=1 width=8) (actual time=16.379..16.380 rows=1 loops=1)
   ->  Seq Scan on experiments e  (cost=0.00..6432.05 rows=42605 width=0) (actual time=0.010..14.411 rows=42605 loops=1)
 Planning Time: 0.131 ms
 Execution Time: 16.404 ms
(4 rows)
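
The plan above contains only a sequential scan on experiments, even though the query has four LEFT JOINs. That is consistent with PostgreSQL's join removal: a LEFT JOIN whose right side is unique on the join key, and whose columns are not used anywhere else in the query, cannot change the number of rows, so the planner may drop it. A minimal sketch of that equivalence follows. It uses an in-memory SQLite database as a stand-in, with hypothetical toy data and simplified table definitions that are assumptions, not the real schema. It checks only that the counts agree, not what plan PostgreSQL picks.

```python
import sqlite3

# Toy schema: table and column names follow the query above, but the
# definitions and rows are made up for illustration.
SCHEMA_AND_DATA = """
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE workspaces (id INTEGER PRIMARY KEY);
CREATE TABLE projects (id INTEGER PRIMARY KEY, workspace_id INTEGER);
CREATE TABLE runs (id INTEGER PRIMARY KEY);
CREATE TABLE experiments (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER,
    project_id INTEGER,
    best_trial_id INTEGER
);
INSERT INTO users VALUES (1), (2);
INSERT INTO workspaces VALUES (1);
INSERT INTO projects VALUES (1, 1), (2, 1);
INSERT INTO runs VALUES (10), (11), (12);
INSERT INTO experiments VALUES
    (1, 1, 1, 10),
    (2, 2, 2, 11),
    (3, 1, 1, NULL),  -- no best trial yet
    (4, 2, 2, 99);    -- best_trial_id with no matching run
"""

# Same shape as the new pagination count query.
JOINED_COUNT = """
SELECT count(*)
FROM experiments AS e
LEFT JOIN users u ON e.owner_id = u.id
LEFT JOIN projects p ON e.project_id = p.id
LEFT JOIN workspaces w ON p.workspace_id = w.id
LEFT JOIN runs AS r ON r.id = e.best_trial_id
"""

PLAIN_COUNT = "SELECT count(*) FROM experiments"


def count(conn: sqlite3.Connection, sql: str) -> int:
    """Run a single-value count query and return the integer result."""
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_AND_DATA)

joined = count(conn, JOINED_COUNT)
plain = count(conn, PLAIN_COUNT)
print(f"joined count = {joined}, plain count = {plain}")
```

Each joined table is matched on its primary key, so every experiment row pairs with at most one row per join, and LEFT JOIN keeps experiments that have no match. Once a filter references a joined table (for example, a condition on a user or workspace column), the join can no longer be dropped and the count query has to execute it.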

@NicholasBlaskey
Contributor Author

Old pagination count for experiments-search

                                                                                                    QUERY PLAN                                                                                                     
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=37071.25..37071.26 rows=1 width=8) (actual time=386.638..390.547 rows=1 loops=1)
   ->  Gather  (cost=37071.04..37071.25 rows=2 width=8) (actual time=386.630..390.542 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=36071.04..36071.05 rows=1 width=8) (actual time=375.230..375.240 rows=1 loops=3)
               ->  Hash Join  (cost=29420.43..36026.66 rows=17752 width=0) (actual time=327.340..371.486 rows=14202 loops=3)
                     Hash Cond: (p.workspace_id = w.id)
                     ->  Hash Join  (cost=29419.36..35939.50 rows=17752 width=4) (actual time=327.215..364.856 rows=14202 loops=3)
                           Hash Cond: (e.project_id = p.id)
                           ->  Hash Join  (cost=29418.25..35852.29 rows=17752 width=4) (actual time=327.189..358.015 rows=14202 loops=3)
                                 Hash Cond: (e.owner_id = u.id)
                                 ->  Parallel Hash Left Join  (cost=29403.96..35790.56 rows=17752 width=8) (actual time=327.067..351.530 rows=14202 loops=3)
                                       Hash Cond: (e.best_trial_id = t.run_id)
                                       ->  Parallel Hash Left Join  (cost=14701.98..20987.04 rows=17752 width=12) (actual time=161.228..177.258 rows=14202 loops=3)
                                             Hash Cond: (e.best_trial_id = t_1.run_id)
                                             ->  Parallel Seq Scan on experiments e  (cost=0.00..6183.52 rows=17752 width=12) (actual time=0.007..7.922 rows=14202 loops=3)
                                             ->  Parallel Hash  (cost=12905.18..12905.18 rows=143744 width=4) (actual time=159.716..159.719 rows=81455 loops=3)
                                                   Buckets: 262144  Batches: 1  Memory Usage: 11680kB
                                                   ->  Parallel Hash Join  (cost=8990.40..12905.18 rows=143744 width=4) (actual time=55.842..128.631 rows=81455 loops=3)
                                                         Hash Cond: (t_1.run_id = r_1.id)
                                                         ->  Parallel Seq Scan on trials_v2 t_1  (cost=0.00..3537.44 rows=143744 width=4) (actual time=0.004..12.519 rows=81455 loops=3)
                                                         ->  Parallel Hash  (cost=7725.91..7725.91 rows=101159 width=4) (actual time=55.696..55.697 rows=81455 loops=3)
                                                               Buckets: 262144  Batches: 1  Memory Usage: 11648kB
                                                               ->  Parallel Index Only Scan using trials_pkey on runs r_1  (cost=0.42..7725.91 rows=101159 width=4) (actual time=1.171..22.921 rows=81455 loops=3)
                                                                     Heap Fetches: 0
                                       ->  Parallel Hash  (cost=12905.18..12905.18 rows=143744 width=4) (actual time=165.075..165.077 rows=81455 loops=3)
                                             Buckets: 262144  Batches: 1  Memory Usage: 11680kB
                                             ->  Parallel Hash Join  (cost=8990.40..12905.18 rows=143744 width=4) (actual time=58.237..129.305 rows=81455 loops=3)
                                                   Hash Cond: (t.run_id = r.id)
                                                   ->  Parallel Seq Scan on trials_v2 t  (cost=0.00..3537.44 rows=143744 width=4) (actual time=0.005..11.109 rows=81455 loops=3)
                                                   ->  Parallel Hash  (cost=7725.91..7725.91 rows=101159 width=4) (actual time=56.154..56.154 rows=81455 loops=3)
                                                         Buckets: 262144  Batches: 1  Memory Usage: 11648kB
                                                         ->  Parallel Index Only Scan using trials_pkey on runs r  (cost=0.42..7725.91 rows=101159 width=4) (actual time=0.016..24.312 rows=81455 loops=3)
                                                               Heap Fetches: 0
                                 ->  Hash  (cost=11.35..11.35 rows=235 width=4) (actual time=0.107..0.107 rows=235 loops=3)
                                       Buckets: 1024  Batches: 1  Memory Usage: 17kB
                                       ->  Seq Scan on users u  (cost=0.00..11.35 rows=235 width=4) (actual time=0.008..0.060 rows=235 loops=3)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=8) (actual time=0.011..0.012 rows=5 loops=3)
                                 Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                 ->  Seq Scan on projects p  (cost=0.00..1.05 rows=5 width=8) (actual time=0.007..0.008 rows=5 loops=3)
                     ->  Hash  (cost=1.03..1.03 rows=3 width=4) (actual time=0.020..0.021 rows=3 loops=3)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on workspaces w  (cost=0.00..1.03 rows=3 width=4) (actual time=0.011..0.012 rows=3 loops=3)
 Planning Time: 0.944 ms
 Execution Time: 390.619 ms
(45 rows)
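
For a quick side-by-side, the reported execution times can be pulled out of the two plans and compared. The sketch below copies the `Planning Time` and `Execution Time` lines from the plans above; the helper name `execution_time_ms` is hypothetical and exists only for this illustration. The numbers come from a single run on a database with 42605 experiments, so treat the ratio as indicative rather than a benchmark.

```python
import re

# Tail lines copied verbatim from the two EXPLAIN ANALYZE outputs above.
NEW_PLAN_TAIL = " Planning Time: 0.131 ms\n Execution Time: 16.404 ms"
OLD_PLAN_TAIL = " Planning Time: 0.944 ms\n Execution Time: 390.619 ms"


def execution_time_ms(plan_text: str) -> float:
    """Extract the 'Execution Time' value (in ms) from EXPLAIN ANALYZE text."""
    match = re.search(r"Execution Time:\s*([0-9.]+)\s*ms", plan_text)
    if match is None:
        raise ValueError("no 'Execution Time' line found in plan text")
    return float(match.group(1))


old_ms = execution_time_ms(OLD_PLAN_TAIL)
new_ms = execution_time_ms(NEW_PLAN_TAIL)
speedup = old_ms / new_ms

print(f"old = {old_ms} ms, new = {new_ms} ms, speedup = {speedup:.1f}x")
```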

@NicholasBlaskey NicholasBlaskey marked this pull request as ready for review February 5, 2024 22:27
@NicholasBlaskey NicholasBlaskey requested a review from a team as a code owner February 5, 2024 22:27
@NicholasBlaskey NicholasBlaskey merged commit 7a13863 into main Feb 13, 2024
73 of 86 checks passed
@NicholasBlaskey NicholasBlaskey deleted the perf_improve_get_experiments branch February 13, 2024 15:51