This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Federation was broken between my homeserver and matrix.org (and I fixed it with this patch) #14492

Closed
anonfloppa opened this issue Nov 18, 2022 · 8 comments
Assignees
Labels
A-Federation O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@anonfloppa
Description

Federation between my homeserver and matrix.org was broken: my homeserver could send messages, but messages sent from matrix.org were never received. Mine was not the only server affected, and another homeserver admin found a solution in #synapse-admins:

--- synapse/app/_base.py.2022118        2022-11-09 10:52:11.000000000 +0100
+++ synapse/app/_base.py        2022-11-18 00:38:43.054743699 +0100
@@ -650,7 +650,9 @@
     #
     # in short, we somewhat arbitrarily limit requests to 200 * 64K (about 12.5M)
     #
-    max_request_size = 200 * MAX_PDU_SIZE
+    #max_request_size = 200 * MAX_PDU_SIZE
+    max_request_size = 2000 * MAX_PDU_SIZE
 
     # if we have a media repo enabled, we may need to allow larger uploads than that
     if config.media.can_load_media_repo:

This patch worked for the original author, and I can confirm it works for me too. I also made sure that client_max_body_size in my nginx reverse proxy was high enough, but increasing the value in nginx alone did not fix the issue; the patch above was needed. I had been affected by this bug since last weekend and ran many tests, raising timeouts and size limits in nginx, but nothing worked. I also tried changing Synapse cache values and other config options, but that didn't do the trick either.
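For readers who want to check the numbers in the patch, here is a minimal sketch of the arithmetic. It assumes MAX_PDU_SIZE is 65536 bytes (the "64K" in the quoted code comment); `max_request_size` and `to_mib` are hypothetical helpers written for this illustration, not Synapse functions.

```python
# Assumption: 64 KiB, matching the "64K" in the comment quoted in the diff.
MAX_PDU_SIZE = 65536


def max_request_size(multiplier: int) -> int:
    """Request size limit in bytes for a given multiplier (illustrative helper)."""
    return multiplier * MAX_PDU_SIZE


def to_mib(size_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return size_bytes / (1024 * 1024)


original = max_request_size(200)   # limit before the patch
patched = max_request_size(2000)   # limit after the patch

print(f"original: {original} bytes = {to_mib(original)} MiB")  # 13107200 bytes = 12.5 MiB
print(f"patched:  {patched} bytes = {to_mib(patched)} MiB")    # 131072000 bytes = 125.0 MiB
```

So the patch raises the limit tenfold, from about 12.5 MiB to about 125 MiB. It only moves the ceiling; it does not explain why such large requests were being sent in the first place.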

Steps to reproduce

I do not know how to reproduce the initial conditions that caused the bug, but the patch fixed federation between my server and matrix.org.

Homeserver

matrix.org was the server unable to send messages

Synapse Version

1.72.0rc1 on matrix.org; I run 1.71.0

Installation Method

Docker (matrixdotorg/synapse)

Platform

Ubuntu LTS; I use the Ansible playbook from https://github.com/spantaleev/matrix-docker-ansible-deploy

Relevant log output

Here is an example of the logs I was seeing on my side:

Nov 17 21:46:19 localhost matrix-nginx-proxy[2836800]: 2022/11/18 02:46:19 [error] 30#30: *13315398 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.18.0.19, server: matrix-nginx-proxy, request: "PUT /_matrix/federation/v1/send/1668680027792 HTTP/1.0", upstream: "http://172.18.0.21:18111/_matrix/federation/v1/send/1668680027792", host: "matrix.anontier.nl"
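If you want to check your own proxy logs for the same symptom, something like the following sketch may help. It assumes nginx error-log lines shaped like the one above; the regular expression and the `failed_send_transactions` helper are hypothetical and were written only against that single sample, so other log formats will likely need adjustments.

```python
import re

# Hypothetical pattern, derived only from the sample log line in this report:
# an nginx [error] entry with a recv() failure on the federation send endpoint.
SEND_ERROR = re.compile(
    r'\[error\].*?(?P<reason>recv\(\) failed \(\d+: [^)]*\)).*?'
    r'request: "PUT /_matrix/federation/v1/send/(?P<txn_id>[^ ]+) '
)


def failed_send_transactions(lines):
    """Yield (transaction_id, reason) for each matching log line."""
    for line in lines:
        match = SEND_ERROR.search(line)
        if match:
            yield match.group("txn_id"), match.group("reason")


sample = (
    'Nov 17 21:46:19 localhost matrix-nginx-proxy[2836800]: 2022/11/18 02:46:19 '
    '[error] 30#30: *13315398 recv() failed (104: Connection reset by peer) '
    'while reading response header from upstream, client: 172.18.0.19, '
    'server: matrix-nginx-proxy, request: "PUT /_matrix/federation/v1/send/'
    '1668680027792 HTTP/1.0", upstream: "http://172.18.0.21:18111/_matrix/'
    'federation/v1/send/1668680027792", host: "matrix.anontier.nl"'
)

for txn_id, reason in failed_send_transactions([sample]):
    print(txn_id, reason)
```

Repeated failures for incoming `/send/` transactions, while outgoing federation keeps working, match the behaviour described in this report.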


Here is a log that was sent to me by a Matrix developer:

2022-11-16 08:19:04,015 - synapse.http.matrixfederationclient - 672 - INFO - federation_transaction_transmission_loop-1891708-- - {PUT-O-1387436} [anontier.nl] Request failed: PUT matrix://anontier.nl/_matrix/federation/v1/send/1668471702441: HttpResponseException('502: Bad Gateway')
2022-11-16 08:20:01,940 - synapse.http.matrixfederationclient - 629 - INFO - federation_transaction_transmission_loop-1891708-- - {PUT-O-1387436} [anontier.nl] Got response headers: 502 Bad Gateway

Anything else that would be useful to know?

No response

@grinapo
grinapo commented Nov 19, 2022

[The mentioned admin speaking]
Also worth mentioning: if your reverse proxy limits the maximum request size, the Synapse log won't even show the problem, because the proxy will reject the request itself (for example with 413: Request Entity Too Large).

The failing endpoint was _matrix/federation/v1/send/, and only on the inbound side: when Synapse backfilled, it retrieved the missing messages (but not the waiting, extremely large bundle).

See also #9817 commit 296a23f ...
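The two failure modes discussed in this thread (the proxy rejecting the request versus Synapse rejecting it) can be sketched as a simplified model. This is only an illustration: `first_rejecting_layer` is a hypothetical helper, the default Synapse limit assumes 200 * 64 KiB as in the quoted diff, and treating a 50 MB proxy limit as 50 MiB is an assumption too. Neither nginx nor Synapse is implemented this way.

```python
from typing import Optional

# Assumption: 64 KiB PDU size, so the unpatched Synapse limit is 200 * 64 KiB.
MAX_PDU_SIZE = 65536
DEFAULT_SYNAPSE_LIMIT = 200 * MAX_PDU_SIZE


def first_rejecting_layer(
    body_bytes: int,
    proxy_limit_bytes: int,
    synapse_limit_bytes: int = DEFAULT_SYNAPSE_LIMIT,
) -> Optional[str]:
    """Return which layer would refuse a request body of this size, if any."""
    if body_bytes > proxy_limit_bytes:
        # The proxy answers by itself (e.g. 413), so Synapse logs show nothing.
        return "proxy"
    if body_bytes > synapse_limit_bytes:
        # The proxy forwards the request and Synapse refuses it; in this
        # thread the sending side then saw 502 Bad Gateway.
        return "synapse"
    return None


MIB = 1024 * 1024
print(first_rejecting_layer(60 * MIB, proxy_limit_bytes=50 * MIB))  # proxy
print(first_rejecting_layer(20 * MIB, proxy_limit_bytes=50 * MIB))  # synapse
print(first_rejecting_layer(1 * MIB, proxy_limit_bytes=50 * MIB))   # None
```

This is why raising only one of the two limits may not be enough: whichever limit is smaller still applies.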

@davidmehren
We also ran into this issue, except that in our case it was nginx, not Synapse, limiting the request size to /_matrix/federation/v1/send/. We also use matrix-docker-ansible-deploy, which sets client_max_body_size to 50 MB by default. So Synapse seems to sometimes send federation requests bigger than 50 MB, which seems ... somewhat big to me? After increasing the limit in nginx to 100 MB, Synapse happily processes the incoming requests, at least as far as I can tell.

@davidmehren
It turned out that my issue was more or less "user error", as matrix-docker-ansible-deploy allows 150 MB for federation requests in its default, recommended configuration (which I didn't use).

But I wonder if Synapse is supposed to make federation requests that are bigger than 50 megabytes, or if there is some kind of bug somewhere.

@DMRobertson DMRobertson self-assigned this Nov 21, 2022
@DMRobertson DMRobertson added S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. O-Occasional Affects or can be seen by some users regularly or most users rarely A-Federation labels Nov 22, 2022
@DMRobertson
Contributor

But I wonder if Synapse is supposed to make federation requests that are bigger than 50 megabytes, or if there is some kind of bug somewhere.

The latter. Thanks for the report, everyone. We'll investigate.

@grinapo
grinapo commented Nov 23, 2022

@davidmehren I didn't comment earlier because you repeated what I had just written about the reverse proxy limit, but since you mentioned "it was a user error" in #14513, I wanted to say that no, it wasn't. A sane reverse proxy limit is not a user error in general, regardless of whether you and Docker would have used a different limit in your case (since that limit would have been hit as well, maybe later).

@DMRobertson if you need any info (since we have seen both the source and the events), contact me [@grin:grin.hu]; I'd rather not put the details here.

@Cknight70
I'm having this issue too

@DMRobertson
Contributor

DMRobertson commented Dec 16, 2022

I'm having this issue too

@Cknight70 What is your homeserver's address? If you don't want to share publicly, I'm reachable at @DMRobertson:matrix.org

@DMRobertson
Contributor

Thanks, everyone, for reporting, and apologies for the delay in updating this issue.

But I wonder if Synapse is supposed to make federation requests that are bigger than 50 megabytes, or if there is some kind of bug somewhere.

There definitely was a bug. See GHSA-f3wc-3vxv-xmvr for details; the fix was in #14642.
