go-ipfs on gateways gets extremely slow #6564

Closed
hsanjuan opened this issue Aug 12, 2019 · 7 comments
Labels
kind/support A question or request for support topic/gateway Topic gateway

Comments

@hsanjuan
Contributor

Opening an issue to track a problem we have recently seen:

  • go-ipfs gets really slow: simple CLI commands usually hang for minutes, yet there are no signs of increased CPU usage, disk usage, memory usage, goroutines or FDs.

With logging enabled, output arrives in batches: a bunch of logs come out, a few seconds pass, another bunch of logs come out, and so on. In these logs we mainly see:

  • Lots of very similar bitswap errors. Possibly we are repeatedly trying to obtain blocks from peers that don't support the protocol?
Aug 12 13:19:24 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:16:50.752356 INFO bitswap messagequeue.go:192: cant open message sender to peer 12D3KooWGsS5CUHZWXQGbwZ8Qd4TtEwdWi8tRaSVEQApKJjuxGB3: protocol not supported
Aug 12 13:19:24 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:16:50.752381 INFO bitswap messagequeue.go:192: cant open message sender to peer 12D3KooWG2kvc7mV8HphyJKRs8tBmaYQvYAJz1Ax2oDF6UPQzvmd: protocol not supported
Aug 12 13:19:24 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:16:50.752476 INFO bitswap messagequeue.go:192: cant open message sender to peer 12D3KooWS28eHVC6QBYPr1AzL9CdZCfj1WBMoU8diSbYaAPJFdgu: protocol not supported
Aug 12 13:19:24 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:16:50.752650 INFO bitswap messagequeue.go:192: cant open message sender to peer 
  • Lots of very similar pubsub errors. Reconnecting to peers that are already connected (why?):
2 13:25:40.497456 WARNING pubsub pubsub.go:280: already have connection to peer:  Qmds8mpKNJKt5TQAFXEVviKajz9LTxKvcabNShD3YMSZw3
Aug 12 13:30:26 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:25:40.497477 WARNING pubsub pubsub.go:280: already have connection to peer:  QmRAiGtfiXzENxia6YiBG94UUQp7fbSaV6dVQj91zj2KQj
Aug 12 13:30:26 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:25:40.497495 WARNING pubsub pubsub.go:280: already have connection to peer:  QmcFs5sCaWKDnkhnLPeZqMp6nPmAsUVVTHCoQr2n6uPzkp
Aug 12 13:30:26 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:25:40.497513 WARNING pubsub pubsub.go:280: already have connection to peer:  
...
  • Lots of errors reading messages in the DHT subsystem (normal for hosts behind NAT?):
13:54:15.746336 INFO dht dht_net.go:381: error reading message, bailing:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.746481 INFO dht dht_net.go:384: error reading message, trying again:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.746654 INFO dht dht_net.go:381: error reading message, bailing:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.746864 INFO dht dht_net.go:384: error reading message, trying again:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.747060 INFO dht dht_net.go:381: error reading message, bailing:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.747271 INFO dht dht_net.go:384: error reading message, trying again:  context canceled
Aug 12 13:57:18 gateway-bank1-ewr1 ipfs[16988]: 2019-08-12 13:54:15.747476 INFO dht dht_net.go:381: error reading message, bailing:  context canceled
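To see which of these messages actually dominates a batch, the log lines can be tallied by the code location that emitted them. The following is only a minimal sketch, assuming the lines keep the "LEVEL subsystem file.go:line:" shape seen in the excerpts above; the function name `tallyLogSites` and the regular expression are illustrative and not part of go-ipfs.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// logSite matches the "LEVEL subsystem file.go:line:" part of a log line.
// The pattern is an assumption based on the excerpts in this issue.
var logSite = regexp.MustCompile(`\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\s+(\S+)\s+(\S+\.go:\d+):`)

// tallyLogSites counts lines per "subsystem file.go:line (LEVEL)" key.
// Lines that do not match the pattern are ignored.
func tallyLogSites(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		m := logSite.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		counts[m[2]+" "+m[3]+" ("+m[1]+")"]++
	}
	return counts
}

func main() {
	// Read the log from standard input, one entry per line.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}

	counts := tallyLogSites(lines)
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	// Most frequent call sites first; ties are ordered by key.
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] > counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Printf("%6d  %s\n", counts[k], k)
	}
}
```

Piping the daemon's log output into this program prints one row per emitting call site, which makes it easier to tell whether bitswap, pubsub or the DHT accounts for most of each batch.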

What can cause contention so severe that the CLI itself stops working? (Probably the API HTTP server hanging when handling new requests? And without any significant CPU usage increase that we could see in the logs?)

@lanzafame can you upload the logs you extracted? (I can't atm)

@hsanjuan hsanjuan added the kind/support A question or request for support label Aug 12, 2019
@lanzafame
Contributor

@Stebalien hsanjuan opened the issue for me 🙃

ipfs-debug-bank2-ewr1.tar.gz
ipfs-debug-bank1-ewr1.tar.gz

@lanzafame lanzafame added the topic/gateway Topic gateway label Aug 13, 2019
@hsanjuan
Contributor Author

I am now experiencing similar issues in bank2-ewr1. This is the collected information:

QmVoPq1YiJE2ikd1BFjsnMJQGnqUARFLe1umj6HNvE8EST

It seems ipfs refuses to serve some requests for locally available data and just hangs.

@Stebalien
Member

Is that a gateway? It's running a GC, although it should still serve read requests while doing that. Is the data already on the gateway, or on cluster?

@Stebalien
Member

I'm not seeing anything stuck reading from disk.

But wow, I'm seeing a bunch of stuff blocked on creating bitswap sessions.
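
One way to spot this kind of pile-up is to group the goroutines in a stack dump by wait state and by the first frame outside the Go runtime. The following is only a rough sketch, assuming a dump in the plain-text format the Go runtime prints (for example from a pprof goroutine endpoint with `debug=2`); the name `summarizeDump` and the rule of skipping `runtime.` and `sync.` frames are illustrative heuristics, not part of go-ipfs.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
	"strings"
)

// header matches lines such as "goroutine 42 [semacquire, 3 minutes]:"
// and captures the wait state without the duration.
var header = regexp.MustCompile(`^goroutine \d+ \[([^,\]]+)`)

// summarizeDump counts goroutines per "state @ function", where function is
// the first frame that is not in the runtime or sync packages (a heuristic).
func summarizeDump(lines []string) map[string]int {
	counts := make(map[string]int)
	state := "" // wait state of the goroutine being read; empty once counted
	flush := func(frame string) {
		if state != "" {
			counts[state+" @ "+frame]++
			state = ""
		}
	}
	for _, line := range lines {
		if m := header.FindStringSubmatch(line); m != nil {
			flush("(no user frame)")
			state = m[1]
			continue
		}
		// Skip everything until a header is seen, plus file/line and creator lines.
		if state == "" || strings.HasPrefix(line, "\t") || strings.HasPrefix(line, "created by ") {
			continue
		}
		if strings.TrimSpace(line) == "" {
			flush("(no user frame)")
			continue
		}
		// Drop the argument list to keep only the function name.
		fn := line
		if i := strings.LastIndex(line, "("); i > 0 {
			fn = line[:i]
		}
		if strings.HasPrefix(fn, "runtime.") || strings.HasPrefix(fn, "sync.") {
			continue
		}
		flush(fn)
	}
	flush("(no user frame)")
	return counts
}

func main() {
	// Read the goroutine dump from standard input.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}

	counts := summarizeDump(lines)
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	// Largest groups first; ties are ordered by key.
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] > counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Printf("%6d  %s\n", counts[k], k)
	}
}
```

A large group waiting in a lock-related state behind a single function would be consistent with requests hanging while CPU usage stays flat, since blocked goroutines consume no CPU.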

@andrewheadricke

andrewheadricke commented Sep 3, 2019

I'm noticing very poor performance from a lot of the major public gateways (including ipfs.io). Most are unable to find content that is easily reachable on smaller gateways. I wonder if it has anything to do with this?

@Stebalien
Member

@andrewheadricke #6383

@Stebalien
Member

We're no longer seeing this issue.
