Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Optimize filter_events_for_client for faster /messages - v1 #14494

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Nov 19, 2022

Optimize filter_events_for_client. See #14494 (comment) for SQL query optimization talk. Seems like it will make it a magnitude faster.

Fix #14108

Part of #13356 (comment)

Dev notes

Previous optimizations:


Grabbing all of the state via get_state_for_events is the longest part of filter_events_for_client

  • filter_events_for_client
    • get_state_for_events
      • get_state_group_for_events
        • _get_state_group_for_events
      • _get_state_for_groups
        • _get_state_for_groups_using_cache
        • _get_state_groups_from_groups 🐢
          • _get_state_groups_from_groups_txn (db._get_state_groups_from_groups)
      • get_events
        • ...

Of all the state we grab in filter_events_for_client, we only ever use EventTypes.RoomHistoryVisibility and EventTypes.Member from it

  • filter_events_for_client
  • _check_client_allowed_to_see_event
    • _check_filter_send_to_client
      • ->Uses no state
    • get_effective_room_visibility_from_state
      • -> EventTypes.RoomHistoryVisibility 🎈
    • _check_history_visibility
      • -> Uses visibility result from above
    • _check_membership
      • -> Uses visibility result from above
      • -> EventTypes.Member 🎈

Relevant database tables:

  • state_groups
  • state_groups_state
  • state_group_edges
  • event_to_state_groups

Docs:


This thing that is part of _get_state_groups_from_groups_txn seems suspicious:

# Temporarily disable sequential scans in this transaction. This is
# a temporary hack until we can add the right indices in
txn.execute("SET LOCAL enable_seqscan=off")


WITH RECURSIVE state(state_group) AS (
    VALUES(1088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, state s
    WHERE s.state_group = e.state_group
)
SELECT DISTINCT ON (type, state_key)
    type, state_key, event_id
FROM state_groups_state
WHERE
    state_group IN (
        SELECT state_group FROM state
    )
    AND ((type = 'm.room.history_visibility' AND state_key = '') OR (type = 'm.room.member' AND state_key = '@erictroupetester:my.synapse.server'))
ORDER BY type, state_key, state_group DESC;
WITH RECURSIVE sgs(state_group) AS (
    VALUES(1088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
SELECT
    state_group, type, state_key, event_id
FROM state_groups_state
WHERE
    state_group IN (
        SELECT state_group FROM sgs
    )
    AND ((type = 'm.room.history_visibility' AND state_key = '') OR (type = 'm.room.member' AND state_key = '@erictroupetester:my.synapse.server'))
ORDER BY type, state_key, state_group DESC;

 state_group |           type            |              state_key              |                   event_id
-------------+---------------------------+-------------------------------------+----------------------------------------------
           7 | m.room.history_visibility |                                     | $e2uT1YAwZQvMvXG0Ee5mqGyQ1KqVLH7JNhE88QqAnXY
        1088 | m.room.member             | @erictroupetester:my.synapse.server | $fO0oJZQ4rL2UiE7vDYo_uKuNrAmnRESrFhsRNIGFAh8
        1062 | m.room.member             | @erictroupetester:my.synapse.server | $QzoIckuimPX4Y8QZaMgxC5XB45ChAC1RZ43N3Y2FXOQ
        1036 | m.room.member             | @erictroupetester:my.synapse.server | $7-qHsVnP-5zKgeKujvKz1LHfVVEVln3y_GTDwS9f228
        1010 | m.room.member             | @erictroupetester:my.synapse.server | $9RsIrIeoBA-X4loFFOKNHOnqaDm8bNKGzXRhcp6kNuA
         984 | m.room.member             | @erictroupetester:my.synapse.server | $vHsSOWAdn9E6pOLLrO6vnB6TJXCdfyQd0uWl6gMftsE
         957 | m.room.member             | @erictroupetester:my.synapse.server | $V3A1GXrisvM0vY8tRR68e_cayLVbV-9zzkTCQpmXs84
           3 | m.room.member             | @erictroupetester:my.synapse.server | $uTRdQa2r7LgK17YRs-m5DWfkmcsoXgkBAo8If4pcVuA

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

@MadLittleMods MadLittleMods added T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) labels Nov 19, 2022
@@ -130,9 +142,10 @@ def allowed(event: EventBase) -> Optional[EventBase]:
sender_erased=erased_senders.get(event.sender, False),
)

Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the state_groups table exist?

$ psql synapse
# \d+ state_groups
                                Table "public.state_groups"
  Column  |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description
----------+--------+-----------+----------+---------+----------+--------------+-------------
 id       | bigint |           | not null |         | plain    |              |
 room_id  | text   |           | not null |         | extended |              |
 event_id | text   |           | not null |         | extended |              |

It seems like a subset of state_groups_state

$ psql synapse
# \d+ state_groups_state
                               Table "public.state_groups_state"
   Column    |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description
-------------+--------+-----------+----------+---------+----------+--------------+-------------
 state_group | bigint |           | not null |         | plain    |              |
 room_id     | text   |           | not null |         | extended |              |
 type        | text   |           | not null |         | extended |              |
 state_key   | text   |           | not null |         | extended |              |
 event_id    | text   |           | not null |         | extended |              |

Comment on lines 306 to 325
WITH RECURSIVE sgs(state_group) AS (
VALUES(?::bigint)
UNION ALL
SELECT prev_state_group FROM state_group_edges e, sgs s
WHERE s.state_group = e.state_group
)
SELECT
type, state_key, event_id
FROM state_groups_state
WHERE
state_group IN (
SELECT state_group FROM sgs
)
AND (type = ? AND state_key = ?)
ORDER BY
type,
state_key,
-- Use the lastest state in the chain (highest numbered state_group in the chain)
state_group DESC
LIMIT 1
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before

One query that takes ~80ms:

Before raw query and EXPLAIN ANALYZE
EXPLAIN ANALYZE WITH RECURSIVE state(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, state s
    WHERE s.state_group = e.state_group
)
SELECT DISTINCT ON (type, state_key)
    type, state_key, event_id
FROM state_groups_state
WHERE
    state_group IN (
        SELECT state_group FROM state
    )
    AND ((type = 'm.room.history_visibility' AND state_key = '') OR (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org'))
ORDER BY type, state_key, state_group DESC;
                                                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=5274.51..5274.52 rows=1 width=87) (actual time=76.555..76.561 rows=2 loops=1)
   CTE state
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.002..0.443 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.006..0.007 rows=1 loops=59)
                 ->  WorkTable Scan on state s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 58
   ->  Sort  (cost=4990.23..4990.23 rows=1 width=87) (actual time=76.554..76.555 rows=2 loops=1)
         Sort Key: state_groups_state.type, state_groups_state.state_key, state_groups_state.state_group DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=3.11..4990.22 rows=1 width=87) (actual time=0.740..76.542 rows=2 loops=1)
               ->  HashAggregate  (cost=2.27..3.28 rows=101 width=8) (actual time=0.484..0.498 rows=59 loops=1)
                     Group Key: state.state_group
                     Batches: 1  Memory Usage: 24kB
                     ->  CTE Scan on state  (cost=0.00..2.02 rows=101 width=8) (actual time=0.004..0.459 rows=59 loops=1)
               ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..49.37 rows=1 width=87) (actual time=0.009..1.289 rows=0 loops=59)
                     Index Cond: (state_group = state.state_group)
                     Filter: (((type = 'm.room.history_visibility'::text) AND (state_key = ''::text)) OR ((type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text)))
                     Rows Removed by Filter: 1495
 Planning Time: 3.309 ms
 Execution Time: 76.629 ms
(23 rows)

Time: 80.624 ms

After

Two queries that each take about ~4ms

After raw query (for membership) and EXPLAIN ANALYZE
EXPLAIN ANALYZE WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
SELECT
    state_group, type, state_key, event_id
FROM state_groups_state
WHERE
    state_group IN (
        SELECT state_group FROM sgs
    )
    AND (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org')
ORDER BY
    type,
    state_key,
    -- Use the lastest state in the chain
    state_group DESC
LIMIT 1;
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=779.70..779.70 rows=1 width=87) (actual time=1.248..1.251 rows=1 loops=1)
   CTE sgs
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.003..0.653 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.009..0.010 rows=1 loops=59)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.008..0.009 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 58
   ->  Sort  (cost=495.42..495.42 rows=1 width=87) (actual time=1.246..1.247 rows=1 loops=1)
         Sort Key: state_groups_state.state_group DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=3.11..495.41 rows=1 width=87) (actual time=0.925..1.239 rows=1 loops=1)
               ->  HashAggregate  (cost=2.27..3.28 rows=101 width=8) (actual time=0.717..0.729 rows=59 loops=1)
                     Group Key: sgs.state_group
                     Batches: 1  Memory Usage: 24kB
                     ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (actual time=0.005..0.680 rows=59 loops=1)
               ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..4.86 rows=1 width=87) (actual time=0.008..0.008 rows=0 loops=59)
                     Index Cond: ((state_group = sgs.state_group) AND (type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text))
 Planning Time: 2.611 ms
 Execution Time: 1.326 ms
(21 rows)

Time: 4.614 ms
After raw query (for history visibility) and EXPLAIN ANALYZE
WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
SELECT
    state_group, type, state_key, event_id
FROM state_groups_state
WHERE
    state_group IN (
        SELECT state_group FROM sgs
    )
    AND (type = 'm.room.history_visibility' AND state_key = '')
ORDER BY
    type,
    state_key,
    -- Use the lastest state in the chain
    state_group DESC
LIMIT 1;

                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=779.70..779.70 rows=1 width=87) (actual time=0.961..0.964 rows=1 loops=1)
   CTE sgs
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.003..0.458 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.007..0.007 rows=1 loops=59)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 58
   ->  Sort  (cost=495.42..495.42 rows=1 width=87) (actual time=0.960..0.961 rows=1 loops=1)
         Sort Key: state_groups_state.state_group DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=3.11..495.41 rows=1 width=87) (actual time=0.675..0.953 rows=1 loops=1)
               ->  HashAggregate  (cost=2.27..3.28 rows=101 width=8) (actual time=0.503..0.514 rows=59 loops=1)
                     Group Key: sgs.state_group
                     Batches: 1  Memory Usage: 24kB
                     ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (actual time=0.004..0.476 rows=59 loops=1)
               ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..4.86 rows=1 width=87) (actual time=0.007..0.007 rows=0 loops=59)
                     Index Cond: ((state_group = sgs.state_group) AND (type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
 Planning Time: 1.648 ms
 Execution Time: 1.018 ms
(21 rows)

Time: 3.280 ms

Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have exact details to be confident in why this is faster but my thinking was we will be trying to do less by matching quicker because of one criteria at a time and LIMIT 1 where it can bail early after one successful state_groups_state match. And not doing any DISTINCT weirdness that probably keeps track of values.

Plus the query planner shows us that we're not throwing away 88k of rows filtering them away anymore.

We still recurse the entire state_event_edges chain but that is probably quick since there are no joins in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, that is very interesting. I think what is going on here is that the OR condition means that the query planner is no longer just pulling out the specific type/state_key pairs it needs, its instead pulling out the entire state group and filtering in memory. This has significant impact for rooms with large state groups, obviously.

One alternate proposal is instead of using OR conditions is to instead do UNION, e.g.:

EXPLAIN ANALYZE WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
(
    SELECT DISTINCT ON (type, state_key)
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org')
    ORDER BY type, state_key, state_group DESC
) UNION (
    SELECT DISTINCT ON (type, state_key)
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.history_visibility' AND state_key = '')
    ORDER BY type, state_key, state_group DESC
);

This has the advantage of being a single query, and reusing the work for pulling out the state edges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can actually _get_state_groups_from_groups_txn implementation to do the above if we get a state filter? Would likely be a much smaller change?

Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikjohnston Have you tested your query? It does seem like an improvement (~50ms) over develop but takes longer than the proposed above and about ~20ms after Postgres cache is warm still.

https://explain.depesz.com/s/cyaR

Raw query plan
                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1246.10..1246.61 rows=51 width=104) (actual time=28.398..28.403 rows=0 loops=1)
   Group Key: state_groups_state.state_group, state_groups_state.type, state_groups_state.state_key, state_groups_state.event_id
   Batches: 1  Memory Usage: 24kB
   CTE sgs
     ->  Recursive Union  (cost=0.00..515.03 rows=101 width=8) (actual time=0.003..1.594 rows=1 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.28..51.30 rows=10 width=8) (actual time=1.590..1.590 rows=0 loops=1)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.001..0.002 rows=1 loops=1)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.28..5.10 rows=1 width=16) (actual time=1.585..1.585 rows=0 loops=1)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 0
   ->  Append  (cost=0.28..730.56 rows=51 width=104) (actual time=28.392..28.395 rows=0 loops=1)
         ->  Unique  (cost=0.28..304.38 rows=1 width=95) (actual time=18.021..18.021 rows=0 loops=1)
               ->  Nested Loop  (cost=0.28..304.38 rows=1 width=95) (actual time=18.020..18.020 rows=0 loops=1)
                     Join Filter: (state_groups_state.state_group = sgs.state_group)
                     ->  Index Scan Backward using state_groups_state_type_idx on state_groups_state  (cost=0.28..301.10 rows=1 width=95) (actual time=18.019..18.019 rows=0 loops=1)
                           Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text))
                     ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (never executed)
         ->  Unique  (cost=425.28..425.41 rows=50 width=95) (actual time=10.369..10.371 rows=0 loops=1)
               ->  Sort  (cost=425.28..425.41 rows=50 width=95) (actual time=10.368..10.370 rows=0 loops=1)
                     Sort Key: state_groups_state_1.state_group DESC
                     Sort Method: quicksort  Memory: 25kB
                     ->  Hash Join  (cost=300.39..423.87 rows=50 width=95) (actual time=9.244..9.245 rows=0 loops=1)
                           Hash Cond: (state_groups_state_1.state_group = sgs_1.state_group)
                           ->  Bitmap Heap Scan on state_groups_state state_groups_state_1  (cost=297.11..419.88 rows=56 width=95) (actual time=0.921..5.960 rows=232 loops=1)
                                 Recheck Cond: ((type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
                                 Heap Blocks: exact=62
                                 ->  Bitmap Index Scan on state_groups_state_type_idx  (cost=0.00..297.09 rows=56 width=0) (actual time=0.904..0.905 rows=232 loops=1)
                                       Index Cond: ((type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
                           ->  Hash  (cost=2.02..2.02 rows=101 width=8) (actual time=2.140..2.141 rows=1 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                 ->  CTE Scan on sgs sgs_1  (cost=0.00..2.02 rows=101 width=8) (actual time=0.004..1.595 rows=1 loops=1)
 Planning Time: 25.952 ms
 Execution Time: 33.217 ms
(34 rows)

Time: 141.669 ms

Copy link
Member

@erikjohnston erikjohnston Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes only a few ms for me :/

Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes only a few ms for me :/

It's possible I was testing on the wrong database (local vs matrix.org) (kinda hard to judge now that caches are warm eveywhere). We can at least see that it still has a higher cost and deals with more rows.

But I do really like the idea of it being a single query and re-using the state-edge chain!

Talking with @erikjohnston out of band, the plan is to go with the neat UNION solution proposed and I will get a PR for that up in the next few hours (hopefully a much smaller and clean diff for easy review).

Other exploration

Adding LIMIT 1 doesn't seem different and actually slightly higher "cost" (whatever that is worth at this small scale)

https://explain.depesz.com/s/1PQG

Query plan from LIMIT 1
EXPLAIN ANALYZE WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
(
    SELECT DISTINCT ON (type, state_key)
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
) UNION (
    SELECT DISTINCT ON (type, state_key)
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.history_visibility' AND state_key = '')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
);
                                                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=1272.63..1272.66 rows=2 width=104) (actual time=1.541..1.550 rows=2 loops=1)
   CTE sgs
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.003..0.563 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.008..0.008 rows=1 loops=59)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.007..0.007 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 58
   ->  Sort  (cost=988.35..988.36 rows=2 width=104) (actual time=1.540..1.543 rows=2 loops=1)
         Sort Key: state_groups_state.state_group, state_groups_state.type, state_groups_state.state_key, state_groups_state.event_id
         Sort Method: quicksort  Memory: 25kB
         ->  Append  (cost=494.15..988.34 rows=2 width=104) (actual time=1.159..1.534 rows=2 loops=1)
               ->  Limit  (cost=494.15..494.16 rows=1 width=87) (actual time=1.158..1.160 rows=1 loops=1)
                     ->  Unique  (cost=494.15..494.16 rows=1 width=87) (actual time=1.157..1.158 rows=1 loops=1)
                           ->  Sort  (cost=494.15..494.16 rows=1 width=87) (actual time=1.156..1.157 rows=1 loops=1)
                                 Sort Key: state_groups_state.state_group DESC
                                 Sort Method: quicksort  Memory: 25kB
                                 ->  Nested Loop  (cost=0.84..494.14 rows=1 width=87) (actual time=1.140..1.151 rows=1 loops=1)
                                       ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (actual time=0.005..0.591 rows=59 loops=1)
                                       ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..4.86 rows=1 width=87) (actual time=0.009..0.009 rows=0 loops=59)
                                             Index Cond: ((state_group = sgs.state_group) AND (type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text))
               ->  Limit  (cost=494.15..494.16 rows=1 width=87) (actual time=0.370..0.371 rows=1 loops=1)
                     ->  Unique  (cost=494.15..494.16 rows=1 width=87) (actual time=0.370..0.370 rows=1 loops=1)
                           ->  Sort  (cost=494.15..494.16 rows=1 width=87) (actual time=0.369..0.370 rows=1 loops=1)
                                 Sort Key: state_groups_state_1.state_group DESC
                                 Sort Method: quicksort  Memory: 25kB
                                 ->  Nested Loop  (cost=0.84..494.14 rows=1 width=87) (actual time=0.364..0.366 rows=1 loops=1)
                                       ->  CTE Scan on sgs sgs_1  (cost=0.00..2.02 rows=101 width=8) (actual time=0.001..0.012 rows=59 loops=1)
                                       ->  Index Scan using state_groups_state_type_idx on state_groups_state state_groups_state_1  (cost=0.84..4.86 rows=1 width=87) (actual time=0.006..0.006 rows=0 loops=59)
                                             Index Cond: ((state_group = sgs_1.state_group) AND (type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
 Planning Time: 3.998 ms
 Execution Time: 1.640 ms
(33 rows)

Time: 6.662 ms

Getting rid of the DISTINCT at this point seems to make no difference either:

https://explain.depesz.com/s/eEW

Query plan from LIMIT 1 and no DISTINCT
EXPLAIN ANALYZE WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
(
    SELECT
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
) UNION (
    SELECT
        state_group, type, state_key, event_id
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.history_visibility' AND state_key = '')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
);
                                                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=1272.63..1272.66 rows=2 width=104) (actual time=1.537..1.546 rows=2 loops=1)
   CTE sgs
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.003..0.560 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.008..0.008 rows=1 loops=59)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.007..0.007 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 58
   ->  Sort  (cost=988.35..988.36 rows=2 width=104) (actual time=1.536..1.539 rows=2 loops=1)
         Sort Key: state_groups_state.state_group, state_groups_state.type, state_groups_state.state_key, state_groups_state.event_id
         Sort Method: quicksort  Memory: 25kB
         ->  Append  (cost=494.15..988.34 rows=2 width=104) (actual time=1.144..1.529 rows=2 loops=1)
               ->  Limit  (cost=494.15..494.16 rows=1 width=87) (actual time=1.143..1.145 rows=1 loops=1)
                     ->  Sort  (cost=494.15..494.16 rows=1 width=87) (actual time=1.142..1.143 rows=1 loops=1)
                           Sort Key: state_groups_state.state_group DESC
                           Sort Method: quicksort  Memory: 25kB
                           ->  Nested Loop  (cost=0.84..494.14 rows=1 width=87) (actual time=1.125..1.136 rows=1 loops=1)
                                 ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (actual time=0.005..0.592 rows=59 loops=1)
                                 ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..4.86 rows=1 width=87) (actual time=0.009..0.009 rows=0 loops=59)
                                       Index Cond: ((state_group = sgs.state_group) AND (type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text))
               ->  Limit  (cost=494.15..494.16 rows=1 width=87) (actual time=0.380..0.381 rows=1 loops=1)
                     ->  Sort  (cost=494.15..494.16 rows=1 width=87) (actual time=0.380..0.380 rows=1 loops=1)
                           Sort Key: state_groups_state_1.state_group DESC
                           Sort Method: quicksort  Memory: 25kB
                           ->  Nested Loop  (cost=0.84..494.14 rows=1 width=87) (actual time=0.375..0.377 rows=1 loops=1)
                                 ->  CTE Scan on sgs sgs_1  (cost=0.00..2.02 rows=101 width=8) (actual time=0.000..0.013 rows=59 loops=1)
                                 ->  Index Scan using state_groups_state_type_idx on state_groups_state state_groups_state_1  (cost=0.84..4.86 rows=1 width=87) (actual time=0.006..0.006 rows=0 loops=59)
                                       Index Cond: ((state_group = sgs_1.state_group) AND (type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
 Planning Time: 4.449 ms
 Execution Time: 1.632 ms
(31 rows)

Time: 7.024 ms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in v2 -> #14527

Comment on lines +348 to +356
txn.execute(sql, (group, EventTypes.RoomHistoryVisibility, ""))
history_vis_info = txn.fetchone()
if history_vis_info is not None:
row_info_list.append(history_vis_info)

txn.execute(sql, (group, EventTypes.Member, user_id_viewing_events))
membership_info = txn.fetchone()
if membership_info is not None:
row_info_list.append(membership_info)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can batch up these two individual queries to have less database round-trip time?

Is there a way we can batch up all the queries across all of the state groups?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see the UNION idea from @erikjohnston in #14494 (comment)

@MadLittleMods MadLittleMods changed the title Optimize filter_events_for_client Optimize filter_events_for_client for faster /messages Nov 22, 2022
Comment on lines 173 to +175
@cancellable
@trace
@tag_args
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should @cancellable be the innermost or outermost decorator?

I am guessing outermost but perhaps it doesn't matter

Comment on lines +207 to +208
# Since we're making decisions based on the state, we need to wait.
await_full_state = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this accurate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the source of our TestPartialStateJoin failures in Complement (https://github.com/matrix-org/synapse/actions/runs/3520947916/jobs/5902365185#step:6:11950). What should I be doing here?

Comment on lines +113 to +118
# Grab the history visibility and membership for each of the events. That's all we
# need to know in order to filter them.
event_id_to_state = await storage.state._get_state_for_events_when_filtering_for_client(
# we exclude outliers at this point, and then handle them separately later
event_ids=non_outlier_event_ids,
user_id_viewing_events=user_id,
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we've changed to more individual queries, it's hard to just test all queries that would make up a real life transaction against the database and be 100% confident that everything is better. Can we test this against matrix.org somehow to verify it's an improvement before merging (deploy to a special client_reader worker that you can route my traffic to)?

event_id_to_state = await storage.state._get_state_for_events_when_filtering_for_client(
# we exclude outliers at this point, and then handle them separately later
event_ids=non_outlier_event_ids,
user_id_viewing_events=user_id,
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is slightly rushed here and I wish I could have explored more before putting this up for review but this does seem like an improvement and we need to get any change deployed today so I've marked as ready for review.

Let me know if this just isn't right to ship.

@MadLittleMods MadLittleMods marked this pull request as ready for review November 22, 2022 07:17
@MadLittleMods MadLittleMods requested a review from a team as a code owner November 22, 2022 07:17
@MadLittleMods MadLittleMods changed the title Optimize filter_events_for_client for faster /messages Optimize filter_events_for_client for faster /messages - v1 Nov 22, 2022
@MadLittleMods MadLittleMods marked this pull request as draft November 22, 2022 18:18
@MadLittleMods MadLittleMods removed the request for review from a team November 22, 2022 18:18
@MadLittleMods
Copy link
Contributor Author

Closing in favor of v2, #14527

@MadLittleMods MadLittleMods added the A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db label May 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

filter_events_for_client is time-consuming and expensive
2 participants