Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add some tracing spans to give insight into local joins #13439

Merged
merged 2 commits into from
Aug 3, 2022

Conversation

H-Shay
Copy link
Contributor

@H-Shay H-Shay commented Aug 2, 2022

As part of the local joins investigation I'd like to add some tracing spans to get better insight into the timing of processing the /join request.

@H-Shay H-Shay requested a review from a team as a code owner August 2, 2022 20:11
state_event_ids=state_event_ids,
depth=depth,
)
with opentracing.start_active_span("update_membership_locked"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably give the Linearizer some tracing annotations at some point, which would probably make this redundant. but that's a bigger job and this is certainly fine for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what that job would entail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really, much the same as you're doing here - adding with opentracing.start_active_span(...) calls around things that might be slow. I'd probably add one at the top level of Linearizer.queue (or rather, in the nested _ctx_manager), and another at the point where we wait for the lock (await new_defer).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was asked elsewhere but I didn't see the answer- is there any performance penalty or other drawback to consider when adding opentracing spans?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there's a small performance penalty, which is one reason not to go completely crazy with adding spans.

There are two aspects to the performance penalty. One, each call to start_active_span results in a few function calls and python objects. Two, if tracing is actually enabled for the current request then we additionally have to serialize all the trace data and send it over the network to the jaeger server, and obviously the more spans, the more work this is.

Plus, in some ways, less can be more when it comes to looking at a trace and understanding what's going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for explaining!

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@H-Shay H-Shay merged commit a648a06 into develop Aug 3, 2022
@H-Shay H-Shay deleted the shay/investigation branch August 3, 2022 17:19
@H-Shay
Copy link
Contributor Author

H-Shay commented Aug 3, 2022

Thanks! Merged.

@H-Shay H-Shay self-assigned this Aug 5, 2022
azmeuk pushed a commit to azmeuk/synapse that referenced this pull request Aug 8, 2022
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Aug 25, 2022
Synapse 1.65.0 (2022-08-16)
===========================

No significant changes since 1.65.0rc2.

Synapse 1.65.0rc2 (2022-08-11)
==============================

Internal Changes
----------------

- Revert 'Remove the unspecced `room_id` field in the `/hierarchy` response. ([\matrix-org#13365](matrix-org#13365))' to give more time for clients to update. ([\matrix-org#13501](matrix-org#13501))

Synapse 1.65.0rc1 (2022-08-09)
==============================

Features
--------

- Add support for stable prefixes for [MSC2285 (private read receipts)](matrix-org/matrix-spec-proposals#2285). ([\matrix-org#13273](matrix-org#13273))
- Add new unstable error codes `ORG.MATRIX.MSC3848.ALREADY_JOINED`, `ORG.MATRIX.MSC3848.NOT_JOINED`, and `ORG.MATRIX.MSC3848.INSUFFICIENT_POWER` described in [MSC3848](matrix-org/matrix-spec-proposals#3848). ([\matrix-org#13343](matrix-org#13343))
- Use stable prefixes for [MSC3827](matrix-org/matrix-spec-proposals#3827). ([\matrix-org#13370](matrix-org#13370))
- Add a new module API method to translate a room alias into a room ID. ([\matrix-org#13428](matrix-org#13428))
- Add a new module API method to create a room. ([\matrix-org#13429](matrix-org#13429))
- Add remote join capability to the module API's `update_room_membership` method (in a backwards compatible manner). ([\matrix-org#13441](matrix-org#13441))

Bugfixes
--------

- Update the version of the LDAP3 auth provider module included in the `matrixdotorg/synapse` DockerHub images and the Debian packages hosted on packages.matrix.org to 0.2.2. This version fixes a regression in the module. ([\matrix-org#13470](matrix-org#13470))
- Fix a bug introduced in Synapse v1.41.0 where the `/hierarchy` API returned non-standard information (a `room_id` field under each entry in `children_state`). ([\matrix-org#13365](matrix-org#13365))
- Fix a bug introduced in Synapse 0.24.0 that would respond with the wrong error status code to `/joined_members` requests when the requester is not a current member of the room. Contributed by @andrewdoh. ([\matrix-org#13374](matrix-org#13374))
- Fix bug in handling of typing events for appservices. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13392](matrix-org#13392))
- Fix a bug introduced in Synapse 1.57.0 where rooms listed in `exclude_rooms_from_sync` in the configuration file would not be properly excluded from incremental syncs. ([\matrix-org#13408](matrix-org#13408))
- Fix a bug in the experimental faster-room-joins support which could cause it to get stuck in an infinite loop. ([\matrix-org#13353](matrix-org#13353))
- Faster room joins: fix a bug which caused rejected events to become un-rejected during state syncing. ([\matrix-org#13413](matrix-org#13413))
- Faster room joins: fix error when running out of servers to sync partial state with, so that Synapse raises the intended error instead. ([\matrix-org#13432](matrix-org#13432))

Updates to the Docker image
---------------------------

- Make Docker images build on armv7 by installing cryptography dependencies in the 'requirements' stage. Contributed by Jasper Spaans. ([\matrix-org#13372](matrix-org#13372))

Improved Documentation
----------------------

- Update the 'registration tokens' page to acknowledge that the relevant MSC was merged into version 1.2 of the Matrix specification. Contributed by @moan0s. ([\matrix-org#11897](matrix-org#11897))
- Document which HTTP resources support gzip compression. ([\matrix-org#13221](matrix-org#13221))
- Add steps describing how to elevate an existing user to administrator by manipulating the database. ([\matrix-org#13230](matrix-org#13230))
- Fix wrong headline for `url_preview_accept_language` in documentation. ([\matrix-org#13437](matrix-org#13437))
- Remove redundant 'Contents' section from the Configuration Manual. Contributed by @dklimpel. ([\matrix-org#13438](matrix-org#13438))
- Update documentation for config setting `macaroon_secret_key`. ([\matrix-org#13443](matrix-org#13443))
- Update outdated information on `sso_mapping_providers` documentation. ([\matrix-org#13449](matrix-org#13449))
- Fix example code in module documentation of `password_auth_provider_callbacks`. ([\matrix-org#13450](matrix-org#13450))
- Make the configuration for the cache clearer. ([\matrix-org#13481](matrix-org#13481))

Internal Changes
----------------

- Extend the release script to automatically push a new SyTest branch, rather than having that be a manual process. ([\matrix-org#12978](matrix-org#12978))
- Make minor clarifications to the error messages given when we fail to join a room via any server. ([\matrix-org#13160](matrix-org#13160))
- Enable Complement CI tests in the 'latest deps' test run. ([\matrix-org#13213](matrix-org#13213))
- Fix long-standing bugged logic which was never hit in `get_pdu` asking every remote destination even after it finds an event. ([\matrix-org#13346](matrix-org#13346))
- Faster room joins: avoid blocking when pulling events with partially missing prev events. ([\matrix-org#13355](matrix-org#13355))
- Instrument `/messages` for understandable traces in Jaeger. ([\matrix-org#13368](matrix-org#13368))
- Remove an unused argument to `get_relations_for_event`. ([\matrix-org#13383](matrix-org#13383))
- Add a `merge-back` command to the release script, which automates merging the correct branches after a release. ([\matrix-org#13393](matrix-org#13393))
- Adding missing type hints to tests. ([\matrix-org#13397](matrix-org#13397))
- Faster Room Joins: don't leave a stuck room partial state flag if the join fails. ([\matrix-org#13403](matrix-org#13403))
- Refactor `_resolve_state_at_missing_prevs` to compute an `EventContext` instead. ([\matrix-org#13404](matrix-org#13404), [\matrix-org#13431](matrix-org#13431))
- Faster Room Joins: prevent Synapse from answering federated join requests for a room which it has not fully joined yet. ([\matrix-org#13416](matrix-org#13416))
- Re-enable running Complement tests against Synapse with workers. ([\matrix-org#13420](matrix-org#13420))
- Prevent unnecessary lookups to any external `get_event` cache. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13435](matrix-org#13435))
- Add some tracing to give more insight into local room joins. ([\matrix-org#13439](matrix-org#13439))
- Rename class `RateLimitConfig` to `RatelimitSettings` and `FederationRateLimitConfig` to `FederationRatelimitSettings`. ([\matrix-org#13442](matrix-org#13442))
- Add some comments about how event push actions are stored. ([\matrix-org#13445](matrix-org#13445), [\matrix-org#13455](matrix-org#13455))
- Improve rebuild speed for the "synapse-workers" docker image. ([\matrix-org#13447](matrix-org#13447))
- Fix `@tag_args` being off-by-one with the arguments when tagging a span (tracing). ([\matrix-org#13452](matrix-org#13452))
- Update type of `EventContext.rejected`. ([\matrix-org#13460](matrix-org#13460))
- Use literals in place of `HTTPStatus` constants in tests. ([\matrix-org#13463](matrix-org#13463), [\matrix-org#13469](matrix-org#13469))
- Correct a misnamed argument in state res v2 internals. ([\matrix-org#13467](matrix-org#13467))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants