Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue] Mixnodes and gateway do not close connections properly #3187

Closed
simonwicky opened this issue Mar 15, 2023 · 0 comments · Fixed by #3190
Closed

[Issue] Mixnodes and gateway do not close connections properly #3187

simonwicky opened this issue Mar 15, 2023 · 0 comments · Fixed by #3190
Assignees
Labels
bug Something isn't working bug-needs-triage A bug that needs discussing and triage nym-binaries qa Quality Assurance
Milestone

Comments

@simonwicky
Copy link
Contributor

simonwicky commented Mar 15, 2023

Describe the issue
When the upstream node closes its connection to a node, the connection is not closed properly by it, and leaves an open file in the system. This eventually leads to refusing new connections and makes the node unusable. (

Err(err) => warn!("Failed to accept incoming connection - {err}"),
)
Capture d’écran 2023-03-15 à 09 19 54
This issue was observed both on a gateway, and on a mixnode from first layer.

Expected behaviour
Nodes should cleanly remove connections once they're closed.

Steps to Reproduce
In a loop, start and stop a gateway-client pair. Mixnodes from the first layer will eventually show a lot of open file upon running lsof | grep nym
Ex :

TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:57152 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:57548 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:33548 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:37892 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:54902 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:58410 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:53956 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:45876 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:53528 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:36474 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:35632 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:50766 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:37924 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:42244 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:33074 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:46786 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:51970 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:51316 (CLOSE_WAIT)
TCP li306-26.members.linode.com:1789->88-80-188-73.ip.linodeusercontent.com:46434 (CLOSE_WAIT)

Which area of Nym were you using?
Private testnet of 3*2 mix nodes, one gateway, one socks5-proxy and one network-requester, running on v1.1.12. (Github tag nym-binaries-v1.1.12)

Bug location
I believe this bug is caused by these two lines, not handling the case where the stream is closed and returns None.

Some(framed_sphinx_packet) = framed_conn.next() => {

Some(framed_sphinx_packet) = framed_conn.next() => {

Proposed fix
Here's a proposed bug fix, handling cases where the TCP stream of sphinx packets returns None, signalling the connection has been closed. This fix was briefly tested on the private testnet, correctly closing the leaking open file. It might not be exhaustive, some other places might need care as well.
This diff file was created on top of tag nym-binaries-v1.1.12
bugfix.txt

@simonwicky simonwicky added bug Something isn't working bug-needs-triage A bug that needs discussing and triage qa Quality Assurance labels Mar 15, 2023
jstuczyn added a commit that referenced this issue Mar 16, 2023
@jstuczyn jstuczyn self-assigned this Mar 16, 2023
@jstuczyn jstuczyn added this to the v1.1.14 milestone Mar 16, 2023
benedettadavico pushed a commit that referenced this issue Mar 23, 2023
* applied patch #3187

* applied the same concept to the verloc listener
simonwicky pushed a commit that referenced this issue Apr 5, 2023
* applied patch #3187

* applied the same concept to the verloc listener
simonwicky pushed a commit that referenced this issue Apr 5, 2023
* applied patch #3187

* applied the same concept to the verloc listener
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working bug-needs-triage A bug that needs discussing and triage nym-binaries qa Quality Assurance
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants