Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better error handling on Zulip #1589

Merged
merged 6 commits into from
Oct 23, 2021
Merged

Conversation

alexmv
Copy link
Contributor

@alexmv alexmv commented Sep 21, 2021

We noticed that matterbridge bots were making thousands of requests to Zulip servers with unauthenticated credentials. This is an attempt to stem that tide.

Specifically, it catches failures when registering, which were previously completely ignored -- most of the unauthenticated requests were to yourserver.zulipchat.com which is the example host in the configuration, which isn't currently a valid Zulip realm:

[zulip]
#You can configure multiple servers "[zulip.name]" or "[zulip.name2]"
#In this example we use [zulip.streamchat]
#REQUIRED
[zulip.streamchat]
#Token to connect with zulip API (called bot API key in Settings - Your bots)
#REQUIRED
Token="Yourtokenhere"
#Username of the bot, normally called yourbot-bot@yourserver.zulipchat.com
#See username in Settings - Your bots
#REQUIRED
Login="yourbot-bot@yourserver.zulipchat.com"
#Servername of your zulip instance
#REQUIRED
Server="https://yourserver.zulipchat.com"

It also catches the 401's when fetching events, which were not rate-limited previously.

Finally, it provides a more specific user-agent, which will make it easier to track down errors like this in the future. It does so by doing a small re-arrangement of packages to make the version information available via a package which is not main.

I made commits directly into vendor/github.com/matterbridge/gozulipbot/ directly, rather than open a PR against https://github.com/matterbridge/gozulipbot. It looks like the gozulipbot code in this repo has already diverged from the other repository, so it looked easier to fix it here. Regardless, however, the key flaws of failing to check for failure during queue registration and failing to rate-limit 401 responses also apply to the matterbridge/gozulipbot repository, and should also be applied there.

I'm happy to open a second PR there if you'd prefer; let me know how you'd like to proceed, as it looks like there are changes other than these which need backporting out of vendor/ and into matterbridge/gozulipbot.

@alexmv
Copy link
Contributor Author

alexmv commented Oct 4, 2021

Gentle poke on this?

@42wim
Copy link
Owner

42wim commented Oct 16, 2021

@alexmv Thank you very much! and sorry for the delay.

The gozulipbot vendor is uptodate, but it's against the work branch.

Could you make your PR against https://github.com/matterbridge/gozulipbot/tree/work ?
I'll update the matterbridge dependencies afterwards.

This PR can be closed then, let me know if that's okay for you.

An unknown error (including an unauthorized error) would fall through
with no calls to time.Sleep, resulting in hammering the server as
quickly as possible.

Add a 10-second sleep in the default error case.  The heartbeat is
left with no explicit sleep, but all other codepaths now contain one.
This will allow it to be accessed by other sections of the code.
@alexmv
Copy link
Contributor Author

alexmv commented Oct 19, 2021

I've pushed the gozulipbot-specific changes to matterbridge/gozulipbot#1; this PR shouldn't be closed, because there are changes that are not contained within the vendored gozulipbot.

I've rebased, dropping the gozulipbot-only changes, and re-pushed. This is not expected to pass tests until it is rebased past a commit which pulls in the gozulipbot changes.

@42wim 42wim added the zulip label Oct 23, 2021
@42wim 42wim added this to the 1.23.1 milestone Oct 23, 2021
@codeclimate
Copy link

codeclimate bot commented Oct 23, 2021

Code Climate has analyzed commit 8e4906a and detected 1 issue on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1

View more on Code Climate.

@42wim 42wim merged commit e3ffbca into 42wim:master Oct 23, 2021
@42wim
Copy link
Owner

42wim commented Oct 23, 2021

Thanks!

@alexmv alexmv deleted the zulip-auth-failure branch November 25, 2021 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants