perf: improve get_workspaces query #8751

Merged on Feb 2, 2024 (2 commits)

@NicholasBlaskey (Contributor) commented Jan 25, 2024

Description

Improve performance of get_workspaces.sql.

Running the query against the anon db showed it taking far longer than it should (about 60 ms, with a seq scan over all 42,605 rows of experiments).

This PR essentially undoes the optimization added in #5929.
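As a rough illustration of the two query shapes that the plans in this PR suggest, here is a minimal sketch using an in-memory SQLite database. The table and column names are taken from the query plans, but the schema, the sample rows, and both SQL statements are assumptions made for illustration; the real get_workspaces.sql is not reproduced here and may differ.

```python
import sqlite3

# Hypothetical miniature schema. Names follow the query plans in this PR,
# but this is not the project's real schema or query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workspaces (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, workspace_id INTEGER);
    CREATE TABLE experiments (id INTEGER PRIMARY KEY, project_id INTEGER);
    CREATE INDEX ix_experiments_project_id ON experiments (project_id);
    INSERT INTO workspaces VALUES (1, 'a'), (2, 'b'), (3, 'empty');
    INSERT INTO projects VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO experiments (project_id) VALUES (10), (10), (11), (20);
""")

# Shape 1: join everything, then group. The join output has one row per
# experiment, so the sort and aggregate input grows with the experiments table.
JOIN_AND_GROUP = """
    SELECT w.id,
           COUNT(DISTINCT p.id) AS num_projects,
           COUNT(DISTINCT e.id) AS num_experiments
    FROM workspaces w
    LEFT JOIN projects p ON p.workspace_id = w.id
    LEFT JOIN experiments e ON e.project_id = p.id
    GROUP BY w.id
    ORDER BY w.id
"""

# Shape 2: one outer row per workspace, with each count computed by a
# correlated subquery that can use the index on experiments.project_id.
CORRELATED = """
    SELECT w.id,
           (SELECT COUNT(*) FROM projects
             WHERE workspace_id = w.id) AS num_projects,
           (SELECT COUNT(*) FROM experiments
              JOIN projects ON experiments.project_id = projects.id
             WHERE projects.workspace_id = w.id) AS num_experiments
    FROM workspaces w
    ORDER BY w.id
"""

join_rows = conn.execute(JOIN_AND_GROUP).fetchall()
subquery_rows = conn.execute(CORRELATED).fetchall()
print(join_rows)
print(subquery_rows)
```

Both statements return the same counts on this toy data; the difference is only in how much intermediate data the database has to sort and group.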

Test Plan

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

@cla-bot cla-bot bot added the cla-signed label Jan 25, 2024

netlify bot commented Jan 25, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit fd0c779
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/65bced787be3c90008ed378f

codecov bot commented Jan 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e873381) 47.70% compared to head (fd0c779) 47.70%.
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8751   +/-   ##
=======================================
  Coverage   47.70%   47.70%           
=======================================
  Files        1049     1049           
  Lines      167250   167250           
  Branches     2241     2241           
=======================================
+ Hits        79792    79794    +2     
+ Misses      87300    87298    -2     
  Partials      158      158           
Flag      Coverage Δ
backend   43.16% <ø> (+<0.01%) ⬆️
harness   64.32% <ø> (ø)
web       42.54% <ø> (ø)

see 5 files with indirect coverage changes

@NicholasBlaskey (Contributor, Author) commented Jan 25, 2024

This is the old query plan against latest main (notice the seq scan on experiments)

                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=517.63..520.50 rows=1151 width=239) (actual time=9.636..9.648 rows=80 loops=1)
   Sort Key: w.id, pins.created_at DESC
   Sort Method: quicksort  Memory: 36kB
   ->  GroupAggregate  (cost=418.82..459.11 rows=1151 width=239) (actual time=9.112..9.548 rows=80 loops=1)
         Group Key: w.id, u.username, pins.id
         ->  Sort  (cost=418.82..421.70 rows=1151 width=242) (actual time=8.314..8.494 rows=1263 loops=1)
               Sort Key: w.id, u.username, pins.id
               Sort Method: quicksort  Memory: 219kB
               ->  Hash Left Join  (cost=14.04..360.30 rows=1151 width=242) (actual time=0.378..4.554 rows=1263 loops=1)
                     Hash Cond: (w.id = pins.workspace_id)
                     ->  Hash Left Join  (cost=12.19..355.24 rows=1151 width=230) (actual time=0.248..3.984 rows=1263 loops=1)
                           Hash Cond: (w.user_id = u.id)
                           ->  Hash Right Join  (cost=9.93..349.76 rows=1151 width=222) (actual time=0.188..3.430 rows=1263 loops=1)
                                 Hash Cond: (p.workspace_id = w.id)
                                 ->  Hash Right Join  (cost=7.49..344.10 rows=1151 width=12) (actual time=0.097..2.753 rows=1234 loops=1)
                                       Hash Cond: (e.project_id = p.id)
                                       ->  Seq Scan on experiments e  (cost=0.00..333.51 rows=1151 width=8) (actual time=0.003..2.149 rows=1152 loops=1)
                                       ->  Hash  (cost=5.55..5.55 rows=155 width=8) (actual time=0.071..0.072 rows=129 loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                             ->  Seq Scan on projects p  (cost=0.00..5.55 rows=155 width=8) (actual time=0.005..0.041 rows=129 loops=1)
                                 ->  Hash  (cost=1.64..1.64 rows=64 width=214) (actual time=0.073..0.073 rows=80 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                       ->  Seq Scan on workspaces w  (cost=0.00..1.64 rows=64 width=214) (actual time=0.014..0.038 rows=80 loops=1)
                           ->  Hash  (cost=1.56..1.56 rows=56 width=12) (actual time=0.035..0.036 rows=66 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 12kB
                                 ->  Seq Scan on users u  (cost=0.00..1.56 rows=56 width=12) (actual time=0.007..0.018 rows=66 loops=1)
                     ->  Hash  (cost=1.84..1.84 rows=1 width=16) (actual time=0.111..0.112 rows=16 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on workspace_pins pins  (cost=0.00..1.84 rows=1 width=16) (actual time=0.008..0.017 rows=16 loops=1)
                                 Filter: (user_id = 1)
                                 Rows Removed by Filter: 66
 Planning Time: 1.943 ms
 Execution Time: 9.921 ms
(33 rows)

This is the old query plan against the anon db:

      QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Incremental Sort  (cost=15494.76..15969.79 rows=705 width=245) (actual time=59.206..59.213 rows=3 loops=1)
   Sort Key: w.id, pins.created_at DESC
   Presorted Key: w.id
   Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
   ->  GroupAggregate  (cost=15262.12..15915.29 rows=705 width=245) (actual time=58.843..59.205 rows=3 loops=1)
         Group Key: w.id, u.username, pins.id
         ->  Sort  (cost=15262.12..15368.63 rows=42605 width=248) (actual time=38.549..44.343 rows=42605 loops=1)
               Sort Key: w.id, u.username, pins.id
               Sort Method: external merge  Disk: 2592kB
               ->  Hash Right Join  (cost=15.70..7033.56 rows=42605 width=248) (actual time=0.105..24.227 rows=42605 loops=1)
                     Hash Cond: (e.project_id = p.id)
                     ->  Seq Scan on experiments e  (cost=0.00..6432.05 rows=42605 width=8) (actual time=0.003..13.656 rows=42605 loops=1)
                     ->  Hash  (cost=15.63..15.63 rows=5 width=244) (actual time=0.099..0.104 rows=5 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Nested Loop Left Join  (cost=14.46..15.63 rows=5 width=244) (actual time=0.093..0.102 rows=5 loops=1)
                                 Join Filter: (pins.workspace_id = w.id)
                                 ->  Merge Left Join  (cost=14.46..14.53 rows=5 width=232) (actual time=0.088..0.094 rows=5 loops=1)
                                       Merge Cond: (w.id = p.workspace_id)
                                       ->  Sort  (cost=13.35..13.36 rows=3 width=228) (actual time=0.080..0.083 rows=3 loops=1)
                                             Sort Key: w.id
                                             Sort Method: quicksort  Memory: 25kB
                                             ->  Hash Right Join  (cost=1.07..13.33 rows=3 width=228) (actual time=0.054..0.077 rows=3 loops=1)
                                                   Hash Cond: (u.id = w.user_id)
                                                   ->  Seq Scan on users u  (cost=0.00..11.35 rows=235 width=18) (actual time=0.004..0.041 rows=235 loops=1)
                                                   ->  Hash  (cost=1.03..1.03 rows=3 width=214) (actual time=0.012..0.013 rows=3 loops=1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                         ->  Seq Scan on workspaces w  (cost=0.00..1.03 rows=3 width=214) (actual time=0.008..0.009 rows=3 loops=1)
                                       ->  Sort  (cost=1.11..1.12 rows=5 width=8) (actual time=0.005..0.006 rows=5 loops=1)
                                             Sort Key: p.workspace_id
                                             Sort Method: quicksort  Memory: 25kB
                                             ->  Seq Scan on projects p  (cost=0.00..1.05 rows=5 width=8) (actual time=0.003..0.004 rows=5 loops=1)
                                 ->  Materialize  (cost=0.00..1.03 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=5)
                                       ->  Seq Scan on workspace_pins pins  (cost=0.00..1.02 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=1)
                                             Filter: (user_id = 1)
                                             Rows Removed by Filter: 2
 Planning Time: 0.464 ms
 Execution Time: 59.787 ms
(37 rows)
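When comparing plans like these, it can help to pull the scan nodes out of the text rather than reading them by eye. The following is a small sketch, not part of this PR: `find_scans` and its regular expression are hypothetical helpers that assume the default text format of EXPLAIN ANALYZE, and the two sample lines are copied from the anon db plans in this thread.

```python
import re

# Matches scan nodes in EXPLAIN ANALYZE text output, for example:
#   ->  Seq Scan on experiments e  (cost=...) (actual time=A..B rows=R loops=L)
SCAN_RE = re.compile(
    r"(?P<kind>Seq Scan|Index Only Scan|Index Scan)"
    r"(?: using (?P<index>\S+))? on (?P<table>\w+)"
    r".*?actual time=\d+\.\d+\.\.(?P<end_ms>\d+\.\d+)"
    r" rows=(?P<rows>\d+) loops=(?P<loops>\d+)"
)


def find_scans(plan_text):
    """Return one dict per scan node found in the plan text."""
    scans = []
    for line in plan_text.splitlines():
        m = SCAN_RE.search(line)
        if m:
            scans.append({
                "kind": m["kind"],
                "table": m["table"],
                "index": m["index"],          # None for sequential scans
                "rows": int(m["rows"]),       # rows per loop
                "loops": int(m["loops"]),
                "end_ms": float(m["end_ms"]), # per-loop time to last row
            })
    return scans


# One line from the old anon db plan and one from the updated anon db plan.
old_line = (
    "->  Seq Scan on experiments e  (cost=0.00..6432.05 rows=42605 width=8) "
    "(actual time=0.003..13.656 rows=42605 loops=1)"
)
new_line = (
    "->  Index Only Scan using ix_experiments_project_id on experiments  "
    "(cost=0.29..233.41 rows=8521 width=4) "
    "(actual time=0.005..0.628 rows=8521 loops=5)"
)

scans = find_scans(old_line + "\n" + new_line)
for scan in scans:
    print(scan)
```

Filtering the result for `kind == "Seq Scan"` on large tables is one quick way to spot the kind of scan called out above.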

@NicholasBlaskey (Contributor, Author) commented

Updated query plan on latest main:

                                                                                QUERY PLAN                                                                                
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3579.45..3579.61 rows=64 width=239) (actual time=4.315..4.325 rows=80 loops=1)
   Sort Key: w.id, pins.created_at DESC
   Sort Method: quicksort  Memory: 36kB
   ->  HashAggregate  (cost=6.59..3577.53 rows=64 width=239) (actual time=0.232..4.275 rows=80 loops=1)
         Group Key: w.id, u.username, pins.id
         ->  Hash Left Join  (cost=4.11..6.11 rows=64 width=234) (actual time=0.088..0.131 rows=80 loops=1)
               Hash Cond: (w.id = pins.workspace_id)
               ->  Hash Left Join  (cost=2.26..4.08 rows=64 width=222) (actual time=0.076..0.106 rows=80 loops=1)
                     Hash Cond: (w.user_id = u.id)
                     ->  Seq Scan on workspaces w  (cost=0.00..1.64 rows=64 width=214) (actual time=0.034..0.040 rows=80 loops=1)
                     ->  Hash  (cost=1.56..1.56 rows=56 width=12) (actual time=0.037..0.038 rows=66 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 12kB
                           ->  Seq Scan on users u  (cost=0.00..1.56 rows=56 width=12) (actual time=0.008..0.018 rows=66 loops=1)
               ->  Hash  (cost=1.84..1.84 rows=1 width=16) (actual time=0.008..0.009 rows=0 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 8kB
                     ->  Seq Scan on workspace_pins pins  (cost=0.00..1.84 rows=1 width=16) (actual time=0.008..0.008 rows=0 loops=1)
                           Filter: (user_id = 20)
                           Rows Removed by Filter: 82
         SubPlan 1
           ->  Aggregate  (cost=5.95..5.96 rows=1 width=8) (actual time=0.022..0.022 rows=1 loops=80)
                 ->  Seq Scan on projects  (cost=0.00..5.94 rows=3 width=0) (actual time=0.015..0.021 rows=2 loops=80)
                       Filter: (workspace_id = w.id)
                       Rows Removed by Filter: 127
         SubPlan 2
           ->  Aggregate  (cost=49.81..49.82 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=80)
                 ->  Nested Loop  (cost=0.28..49.76 rows=22 width=0) (actual time=0.018..0.026 rows=14 loops=80)
                       ->  Seq Scan on projects projects_1  (cost=0.00..5.94 rows=3 width=4) (actual time=0.011..0.016 rows=2 loops=80)
                             Filter: (workspace_id = w.id)
                             Rows Removed by Filter: 127
                       ->  Index Only Scan using ix_experiments_project_id on experiments  (cost=0.28..14.37 rows=24 width=4) (actual time=0.003..0.004 rows=9 loops=129)
                             Index Cond: (project_id = projects_1.id)
                             Heap Fetches: 202
 Planning Time: 0.790 ms
 Execution Time: 4.533 ms
(34 rows)
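In EXPLAIN ANALYZE output, the actual time on a node is an average per loop, so the total time spent in a subplan is roughly its per-loop time multiplied by `loops`. A quick sketch of that arithmetic for the two subplans in the plan above (the helper name `total_ms` is made up for illustration):

```python
def total_ms(per_loop_ms, loops):
    """Approximate total time for a plan node: per-loop time times loop count."""
    return per_loop_ms * loops


# (per-loop end time in ms, loops) read from the Aggregate nodes of each SubPlan.
subplans = {
    "SubPlan 1": (0.022, 80),
    "SubPlan 2": (0.028, 80),
}

totals = {name: total_ms(ms, loops) for name, (ms, loops) in subplans.items()}
combined = sum(totals.values())

for name, ms in totals.items():
    print(f"{name}: {ms:.2f} ms")
print(f"combined: {combined:.2f} ms")
```

By this estimate the subplans take about 4 ms together, which suggests they account for most of the 4.275 ms reported on the HashAggregate node.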

Updated query plan on the anon db:

                                                                                 QUERY PLAN                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Incremental Sort  (cost=356.52..1040.82 rows=3 width=245) (actual time=8.474..8.478 rows=3 loops=1)
   Sort Key: w.id, pins.created_at DESC
   Presorted Key: w.id
   Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
   ->  Group  (cost=14.43..1040.69 rows=3 width=245) (actual time=8.286..8.470 rows=3 loops=1)
         Group Key: w.id, u.username, pins.id
         ->  Sort  (cost=14.43..14.43 rows=3 width=240) (actual time=0.083..0.086 rows=3 loops=1)
               Sort Key: w.id, u.username, pins.id
               Sort Method: quicksort  Memory: 25kB
               ->  Nested Loop Left Join  (cost=1.07..14.40 rows=3 width=240) (actual time=0.057..0.080 rows=3 loops=1)
                     Join Filter: (pins.workspace_id = w.id)
                     ->  Hash Right Join  (cost=1.07..13.33 rows=3 width=228) (actual time=0.053..0.074 rows=3 loops=1)
                           Hash Cond: (u.id = w.user_id)
                           ->  Seq Scan on users u  (cost=0.00..11.35 rows=235 width=18) (actual time=0.004..0.038 rows=235 loops=1)
                           ->  Hash  (cost=1.03..1.03 rows=3 width=214) (actual time=0.012..0.012 rows=3 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                 ->  Seq Scan on workspaces w  (cost=0.00..1.03 rows=3 width=214) (actual time=0.007..0.009 rows=3 loops=1)
                     ->  Materialize  (cost=0.00..1.03 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=3)
                           ->  Seq Scan on workspace_pins pins  (cost=0.00..1.02 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=1)
                                 Filter: (user_id = 20)
                                 Rows Removed by Filter: 2
         SubPlan 1
           ->  Aggregate  (cost=1.06..1.07 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=3)
                 ->  Seq Scan on projects  (cost=0.00..1.06 rows=1 width=0) (actual time=0.001..0.002 rows=2 loops=3)
                       Filter: (workspace_id = w.id)
                       Rows Removed by Filter: 3
         SubPlan 2
           ->  Aggregate  (cost=340.98..340.99 rows=1 width=8) (actual time=2.786..2.786 rows=1 loops=3)
                 ->  Nested Loop  (cost=0.29..319.68 rows=8521 width=0) (actual time=0.008..2.132 rows=14202 loops=3)
                       ->  Seq Scan on projects projects_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=2 loops=3)
                             Filter: (workspace_id = w.id)
                             Rows Removed by Filter: 3
                       ->  Index Only Scan using ix_experiments_project_id on experiments  (cost=0.29..233.41 rows=8521 width=4) (actual time=0.005..0.628 rows=8521 loops=5)
                             Index Cond: (project_id = projects_1.id)
                             Heap Fetches: 0
 Planning Time: 0.387 ms
 Execution Time: 8.541 ms
(37 rows)
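To put the four Execution Time figures from this thread side by side, here is a small sketch that computes the ratio of old to updated time for each database. These are single runs, and the old and updated plans filter on different user_id values (1 and 20), so the ratios are only a rough indication.

```python
# Execution Time values (ms) copied from the four plans in this PR.
timings_ms = {
    "latest main": {"old": 9.921, "updated": 4.533},
    "anon db": {"old": 59.787, "updated": 8.541},
}

# Ratio of old to updated execution time for each database.
speedups = {name: t["old"] / t["updated"] for name, t in timings_ms.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")
```

The larger gain on the anon db is consistent with the old plan's cost scaling with the size of experiments, which is far bigger there than on latest main.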

@NicholasBlaskey NicholasBlaskey marked this pull request as ready for review January 25, 2024 14:53
@NicholasBlaskey NicholasBlaskey requested a review from a team as a code owner January 25, 2024 14:53
@NicholasBlaskey NicholasBlaskey enabled auto-merge (squash) February 2, 2024 13:27
@NicholasBlaskey NicholasBlaskey merged commit e0e6cf0 into main Feb 2, 2024
69 of 84 checks passed
@NicholasBlaskey NicholasBlaskey deleted the perf_get_workspaces branch February 2, 2024 13:40
maxrussell pushed a commit that referenced this pull request Mar 21, 2024
2 participants